Abstract

We study the ergodic properties of finite-dimensional systems of SDEs driven by 
non-degenerate additive fractional Brownian motion with arbitrary Hurst parameter 
$H \in (0, 1)$. A general framework is constructed to make precise the notions of
"invariant measure" and "stationary state" for such a system. We then prove under 
rather weak dissipativity conditions that such an SDE possesses a unique stationary 
solution and that the convergence rate of an arbitrary solution towards the stationary 
one is (at least) algebraic. A lower bound on the exponent is also given.

1 Introduction and main result

In this paper, we investigate the long-time behaviour of stochastic differential equations 
driven by fractional Brownian motion. Fractional Brownian motion (or fBm for short) 
is a centred Gaussian process satisfying $B_H(0) = 0$ and

$$\mathbf{E}|B_H(t) - B_H(s)|^2 = |t - s|^{2H} , \qquad t, s > 0 , \tag{1.1}$$

where $H$, the Hurst parameter, is a real number in the range $H \in (0, 1)$. When $H = \frac12$,
one recovers of course the usual Brownian motion, so this is a natural one-parameter 
family of generalisations of the "standard" Brownian motion. It follows from (1.1) that 
fBm is also self-similar, but with the scaling law 

$$t \mapsto B_H(at) \approx t \mapsto a^H B_H(t) ,$$

where $\approx$ denotes equivalence in law. Also, the sample paths of $B_H$ are $\alpha$-Hölder
continuous for every $\alpha < H$. The main difference between fBm and the usual Brownian
motion is that it is neither Markovian, nor a semi-martingale, so most standard tools 
from stochastic calculus cannot be applied to its analysis. 
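
As a purely illustrative aside, the covariance structure implied by (1.1) can be checked numerically. Together with $B_H(0) = 0$, (1.1) gives $\mathbf{E}[B_H(s)B_H(t)] = \frac12(|s|^{2H} + |t|^{2H} - |t-s|^{2H})$, from which both the increment variance and the scaling law can be read off. The Python sketch below is not part of the argument: the function names are hypothetical, and sampling through a Cholesky factor of the covariance matrix is only one simple possibility, chosen here for brevity.

```python
import math
import random


def fbm_cov(s, t, hurst):
    """Covariance E[B_H(s) B_H(t)] implied by (1.1) together with B_H(0) = 0."""
    return 0.5 * (abs(s) ** (2 * hurst) + abs(t) ** (2 * hurst)
                  - abs(t - s) ** (2 * hurst))


def increment_variance(s, t, hurst):
    """E|B_H(t) - B_H(s)|^2, expanded in terms of the covariance."""
    return fbm_cov(t, t, hurst) - 2 * fbm_cov(s, t, hurst) + fbm_cov(s, s, hurst)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = math.sqrt(acc) if i == j else acc / lower[j][j]
    return lower


def sample_fbm(times, hurst, rng):
    """Draw one fBm path at the given strictly positive, distinct times."""
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    lower = cholesky(cov)
    normals = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(lower[i][k] * normals[k] for k in range(i + 1))
            for i in range(len(times))]


if __name__ == "__main__":
    grid = [0.1 * k for k in range(1, 11)]
    print(sample_fbm(grid, hurst=0.3, rng=random.Random(0)))
```

The identity `fbm_cov(a * s, a * t, H) == a ** (2 * H) * fbm_cov(s, t, H)` (up to rounding) is the covariance form of the scaling law stated above.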

Our main motivation is to tackle the problem of ergodicity in non-Markovian sys- 
tems. Such systems arise naturally in several situations. In physics, stochastic forces 
are used to describe the interaction between a (small) system and its (large) environ- 
ment. There is no a-priori reason to assume that the forces applied by the environment 
to the system are independent over disjoint time intervals. In statistical mechanics, 
for example, a non-Markovian noise term appears when one attempts to derive the 
Langevin equation from first principles [JP97, Ris89]. Self-similar stochastic processes 
like fractional Brownian motion appear naturally in hydrodynamics [MVN68]. It ap- 
pears that fractional Brownian motion is also useful to model long-time correlations in 
stock markets [DHPD00, ØH99].

Little seems to be known about the long-time behaviour of non-Markovian sys- 
tems. In the case of the non-Markovian Langevin equation (which is not covered by 
the results in this paper due to the presence of a delay term), the stationary solution 
is explicitly known to be distributed according to the usual equilibrium Gibbs mea- 
sure. The relaxation towards equilibrium is a very hard problem that was solved in 
[JP97, JP98]. It is however still open in the non-equilibrium case, where the invariant 
state cannot be guessed a-priori. One well-studied general framework for the study of
systems driven by noise with extrinsic memory like the ones considered in this paper is 
given by the theory of Random Dynamical Systems (see the monograph [Arn98] and 
the reference list therein). In that framework, the existence of random attractors, and 
therefore the existence of invariant measures seems to be well-understood. On the other 
hand, the problem of uniqueness (in an appropriate sense, see the comment following 
Theorem 1.3 below) of the invariant measure on the random attractor seems to be much 
harder, unless one can show that the system possesses a unique stochastic fixed point. 
The latter situation was studied in [MS02] for infinite-dimensional evolution equations 
driven by fractional Brownian motion. 

The reasons for choosing fBm as driving process for (SDE) below are twofold. 
First, in particular when $H > \frac12$, fractional Brownian motion presents genuine long-time
correlations that persist even under rescaling. The second reason is that there
exist simple, explicit formulae that relate fractional Brownian motion to "standard" 
Brownian motion, which simplifies our analysis. We will limit ourselves to the case 
where the memory of the system comes entirely from the driving noise process, so we 
do not consider stochastic delay equations. 

We will only consider equations driven by non-degenerate additive noise, i.e. we
consider equations of the form

$$dx_t = f(x_t)\,dt + \sigma\,dB_H(t) , \qquad x_0 \in \mathbf{R}^n , \tag{SDE}$$

where $x_t \in \mathbf{R}^n$, $f : \mathbf{R}^n \to \mathbf{R}^n$, $B_H$ is an $n$-dimensional fractional Brownian motion
with Hurst parameter $H$, and $\sigma$ is a constant and invertible $n \times n$ matrix. Of course,
(SDE) should be interpreted as an integral equation. 

In order to ensure the existence of globally bounded solutions and in order to have 
some control on the speed at which trajectories separate, we make throughout the paper 
the following assumptions on the components of (SDE): 

A1 Stability. There exist constants $C_i^{A1} > 0$ such that

$$\langle f(x) - f(y), x - y \rangle \le \min\{C_1^{A1} - C_2^{A1}\|x - y\|^2,\; C_3^{A1}\|x - y\|^2\} ,$$

for every $x, y \in \mathbf{R}^n$.

A2 Growth and regularity. There exist constants $C, N > 0$ such that $f$ and its
derivative satisfy

$$\|f(x)\| \le C(1 + \|x\|)^N , \qquad \|Df(x)\| \le C(1 + \|x\|)^N ,$$

for every $x \in \mathbf{R}^n$.

A3 Non-degeneracy. The $n \times n$ matrix $\sigma$ is invertible.

Remark 1.1 We can assume that $\|\sigma\| \le 1$ without any loss of generality. This assumption
will be made throughout the paper in order to simplify some expressions.

One typical example that we have in mind is given by 

$$f(x) = x - x^3 , \qquad x \in \mathbf{R} ,$$

or any polynomial of odd degree with negative leading coefficient. Notice that $f$ satisfies
A1-A2, but that it is not globally Lipschitz continuous.
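
To see concretely why the cubic example satisfies the stability condition A1, write $d = x - y$. Then $(f(x) - f(y))(x - y) = d^2\bigl(1 - (x^2 + xy + y^2)\bigr)$ and $x^2 + xy + y^2 = \frac34 (x + y)^2 + \frac14 d^2 \ge \frac14 d^2$, so the left-hand side of A1 is at most $d^2 - \frac14 d^4$. This is bounded by $d^2$ and, since $(\frac12 d^2 - 2)^2 \ge 0$, also by $4 - d^2$. The constants $4, 1, 1$ used below are therefore one admissible choice (our own, not taken from the text), and the following Python sketch simply checks the inequality on a grid; the function names are hypothetical.

```python
def f(x):
    """The model drift f(x) = x - x^3."""
    return x - x ** 3


def stability_gap(x, y, c1=4.0, c2=1.0, c3=1.0):
    """Right-hand side of A1 minus its left-hand side; A1 holds iff this is >= 0."""
    d2 = (x - y) ** 2
    lhs = (f(x) - f(y)) * (x - y)
    return min(c1 - c2 * d2, c3 * d2) - lhs


def worst_gap(radius=5.0, steps=201):
    """Smallest gap over a square grid of points (x, y) in [-radius, radius]^2."""
    grid = [-radius + 2.0 * radius * k / (steps - 1) for k in range(steps)]
    return min(stability_gap(x, y) for x in grid for y in grid)


if __name__ == "__main__":
    # Up to rounding, the gap is non-negative; it vanishes e.g. at x = y and at (1, -1).
    print(worst_gap())
```

A grid check is of course no proof; it only complements the short computation given above.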

When the Hurst parameter $H$ of the fBm driving (SDE) is bigger than 1/2, more
regularity for $f$ is required, and we will then sometimes assume that the following
stronger condition holds instead of A2:

A2' Strong regularity. The derivative of $f$ is globally bounded.

Our main result is that (SDE) possesses a unique stationary solution. Furthermore, 
we obtain an explicit bound showing that every (adapted) solution to (SDE) converges 
towards this stationary solution, and that this convergence is at least algebraic. We 
make no claim concerning the optimality of this bound for the class of systems under 
consideration. Our results are slightly different for small and for large values of H, so 
we state them separately. 

Theorem 1.2 (Small Hurst parameter) Let $H \in (0, \frac12)$ and let $f$ and $\sigma$ satisfy A1-A3.
Then, for every initial condition, the solution to (SDE) converges towards a
unique stationary solution in the total variation norm. Furthermore, for every
$\gamma < \max_{\alpha < H} \alpha(1 - 2\alpha)$, the difference between the solution and the stationary solution is
bounded by $C_\gamma t^{-\gamma}$ for large $t$.

Theorem 1.3 (Large Hurst parameter) Let $H \in (\frac12, 1)$ and let $f$ and $\sigma$ satisfy A1-A3
and A2'. Then, for every initial condition, the solution to (SDE) converges towards
a unique stationary solution in the total variation norm. Furthermore, for every $\gamma < \frac18$,
the difference between the solution and the stationary solution is bounded by $C_\gamma t^{-\gamma}$
for large $t$.
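
The threshold for the exponent in Theorem 1.2 can be made explicit. The map $\alpha \mapsto \alpha(1 - 2\alpha)$ is a parabola that increases on $(0, \frac14)$ and attains its maximum $\frac18$ at $\alpha = \frac14$, so the threshold equals $\frac18$ as soon as $H > \frac14$ and $H(1 - 2H)$ otherwise (as a supremum, which is all that matters since the bound on the exponent is strict). The Python sketch below merely evaluates this; the function name is hypothetical.

```python
def small_hurst_exponent(hurst):
    """Supremum of alpha * (1 - 2 * alpha) over 0 < alpha < hurst, for H in (0, 1/2)."""
    if not 0.0 < hurst < 0.5:
        raise ValueError("Theorem 1.2 only covers H in (0, 1/2)")
    # The parabola is increasing up to alpha = 1/4 and decreasing afterwards.
    alpha_star = min(hurst, 0.25)
    return alpha_star * (1.0 - 2.0 * alpha_star)


if __name__ == "__main__":
    for h in (0.1, 0.25, 0.4):
        print(h, small_hurst_exponent(h))
```

In particular, the guaranteed algebraic rate degenerates as $H \to 0$ and is constant for $H \in [\frac14, \frac12)$.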

Remark 1.4 The "uniqueness" part of these statements should be understood as uni- 
queness in law in the class of stationary solutions adapted to the natural filtration in- 
duced by the two-sided fBm that drives the equation. There could in theory be other 
stationary solutions, but they would require knowledge of the future to determine the 
present, so they are usually discarded as unphysical. 

Even in the context of Markov processes, similar situations do occur. One can
well have uniqueness of the invariant measure, but non-uniqueness of the stationary 
state, although other stationary states would have to foresee the future. In this sense, 
the notion of uniqueness appearing in the above statements is similar to the notion of 
uniqueness of the invariant measure for Markov processes. (See e.g. [Arn98], [Cra91] 
and [Cra02] for discussions on invariant measures that are not necessarily measurable 
with respect to the past.) 

Remark 1.5 The case $H = \frac12$ is not covered by these two theorems, but it is well-known
that the convergence toward the stationary state is exponential in this case (see
for example [MT94]). In both cases, the word "total variation" refers to the total varia- 
tion distance between measures on the space of paths, see also Theorem 6.1 below for 
a rigorous formulation of the results above. 

1.1 Idea of proof and structure of the paper 

Our first task is to make precise the notions of "initial condition", "invariant measure",
"uniqueness", and "convergence" appearing in the formulation of Theorems 1.2 and
1.3. This will be achieved in Section 2 below, where we construct a general framework 
for the study of systems driven by non-Markovian noise. Section 3 shows how (SDE) 
fits into that framework. 

The main tool used in the proof of Theorems 1.2 and 1.3 is a coupling construction 
similar in spirit to the ones presented in [Mat02, Hai02]. More precisely, we first 
show by some compactness argument that there exists at least one invariant measure 
$\mu_*$ for (SDE). Then, given an initial condition distributed according to some arbitrary
measure $\mu$, we construct a "coupling process" $(x_t, y_t)$ on $\mathbf{R}^n \times \mathbf{R}^n$ with the following
properties: 

1. The process $x_t$ is a solution to (SDE) with initial condition $\mu_*$.

2. The process $y_t$ is a solution to (SDE) with initial condition $\mu$.

3. The random time $\tau_\infty = \min\{t \mid x_s = y_s \ \forall s \ge t\}$ is almost surely finite.

The challenge is to introduce correlations between $x_s$ and $y_s$ in precisely such a way
that $\tau_\infty$ is finite. If this is possible, the uniqueness of the invariant measure follows
immediately. Bounds on the moments of $\tau_\infty$ furthermore translate into bounds on
the rate of convergence towards this invariant measure. In Section 4, we expose the 
general mechanism by which we construct this coupling. Section 5 is then devoted to
the precise formulation of the coupling process and to the study of its properties, which
will be used in Section 6 to prove Theorems 1.2 and 1.3. We conclude this paper with 
a few remarks on possible extensions of our results to situations that are not covered 
here.

2 General theory of stochastic dynamical systems

In this section, we first construct an abstract framework that can be used to model a 
large class of physically relevant models where the driving noise is stationary. Our 
framework is very closely related to the framework of random dynamical systems 
with however one fundamental difference. In the theory of random dynamical sys- 
tems (RDS), the abstract space used to model the noise part typically encodes the 
future of the noise process. In our framework of "stochastic dynamical systems" (SDS) 
the noise space $W$ typically encodes the past of the noise process. As a consequence,
the evolution on $W$ will be stochastic, as opposed to the deterministic evolution on $\Omega$
one encounters in the theory of RDS. This distinction may seem futile at first sight, and
one could argue that the difference between RDS and SDS is non-existent by adding
the past of the noise process to $\Omega$ and its future to $W$.

The additional structure we require is that the evolution on W possesses a unique 
invariant measure. Although this requirement may sound very strong, it is actually 
not, and most natural examples satisfy it, as long as W is chosen in such a way that 
it does not contain information about the future of the noise. In very loose terms, 
this requirement of having a unique invariant measure states that the noise process 
driving our system is stationary and that the Markov process modelling its evolution 
captures all its essential features in such a way that it could not be used to describe 
a noise process different from the one at hand. In particular, this means that there is 
a continuous inflow of "new randomness" into the system, which is a crucial feature 
when trying to apply probabilistic methods to the study of ergodic properties of the 
system. This is in opposition to the RDS formalism, where the noise is "frozen", as 
soon as an element of $\Omega$ is chosen.

From the mathematical point of view, we will consider that the physical process 
we are interested in lives on a "state space" $X$ and that its driving noise belongs to
a "noise space" $W$. In both cases, we only consider Polish (i.e. complete, separable,
and metrisable) spaces. One should think of the state space as a relatively small space 
which contains all the information accessible to a physical observer of the process. 
The noise space should be thought of as a much bigger abstract space containing all 
the information needed to construct a mathematical model of the driving noise up to
a certain time. The information contained in the noise space is not accessible to the
physical observer. 

Before we state our definition of a SDS, we will recall several notations and def- 
initions, mainly for the sake of mathematical rigour. The reader can safely skip the 
next subsection and come back to it for reference concerning the notations and the 
mathematically precise definitions of the concepts that are used. 

2.1 Preliminary definitions and notations 

First of all, recall the definition of a transition semigroup:

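Definition 2.1 Let $(\mathcal{E}, \mathscr{E})$ be a Polish space endowed with its Borel $\sigma$-field. A transition
semigroup $\mathcal{P}_t$ on $\mathcal{E}$ is a family of maps $\mathcal{P}_t : \mathcal{E} \times \mathscr{E} \to [0, 1]$ indexed by $t \in [0, \infty)$ such
that

i) for every $x \in \mathcal{E}$, the map $A \mapsto \mathcal{P}_t(x, A)$ is a probability measure on $\mathcal{E}$ and, for
every $A \in \mathscr{E}$, the map $x \mapsto \mathcal{P}_t(x, A)$ is $\mathscr{E}$-measurable,

ii) one has the identity

$$\mathcal{P}_{s+t}(x, A) = \int_{\mathcal{E}} \mathcal{P}_s(y, A)\, \mathcal{P}_t(x, dy) ,$$

for every $s, t > 0$, every $x \in \mathcal{E}$, and every $A \in \mathscr{E}$,

iii) $\mathcal{P}_0(x, \cdot\,) = \delta_x$ for every $x \in \mathcal{E}$.

We will freely use the notations

$$(\mathcal{P}_t \psi)(x) = \int_{\mathcal{E}} \psi(y)\, \mathcal{P}_t(x, dy) , \qquad (\mathcal{P}_t \mu)(A) = \int_{\mathcal{E}} \mathcal{P}_t(x, A)\, \mu(dx) ,$$

where $\psi$ is a measurable function on $\mathcal{E}$ and $\mu$ is a measure on $\mathcal{E}$.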
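
For readers who prefer a concrete picture, the simplest instance of Definition 2.1 is a Markov chain on a finite set, where $\mathcal{P}_t$ is a stochastic matrix, the identity ii) is matrix multiplication, and the two notations above are the action of the matrix on column vectors (functions) and on row vectors (measures). The Python sketch below uses a two-state chain in continuous time with explicitly known kernel; the jump rates and all function names are hypothetical choices made only for illustration.

```python
import math

RATE_UP, RATE_DOWN = 2.0, 3.0  # hypothetical jump rates 0 -> 1 and 1 -> 0


def kernel(t):
    """Transition matrix of the two-state chain; entry [x][y] is P_t(x, {y})."""
    total = RATE_UP + RATE_DOWN
    decay = math.exp(-total * t)
    return [
        [(RATE_DOWN + RATE_UP * decay) / total, RATE_UP * (1.0 - decay) / total],
        [RATE_DOWN * (1.0 - decay) / total, (RATE_UP + RATE_DOWN * decay) / total],
    ]


def compose(p_s, p_t):
    """Right-hand side of ii): (x, a) -> sum over y of P_s(y, {a}) P_t(x, {y})."""
    n = len(p_t)
    return [[sum(p_t[x][y] * p_s[y][a] for y in range(n)) for a in range(n)]
            for x in range(n)]


def act_on_function(p_t, psi):
    """(P_t psi)(x) = sum over y of psi(y) P_t(x, {y})."""
    return [sum(p_t[x][y] * psi[y] for y in range(len(psi))) for x in range(len(psi))]


def act_on_measure(p_t, mu):
    """(P_t mu)({a}) = sum over x of P_t(x, {a}) mu({x})."""
    return [sum(mu[x] * p_t[x][a] for x in range(len(mu))) for a in range(len(mu))]


if __name__ == "__main__":
    print(kernel(0.3 + 1.1))
    print(compose(kernel(0.3), kernel(1.1)))  # agrees with the line above up to rounding
```

In this toy example the measure with weights proportional to `(RATE_DOWN, RATE_UP)` is left unchanged by `act_on_measure`, i.e. it is invariant for the semigroup.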

Since we will always work with topological spaces, we will require our transition 
semigroups to have good topological properties. Recall that a sequence $\{\mu_n\}$ of measures
on a topological space $\mathcal{E}$ is said to converge toward a limiting measure $\mu$ in the
weak topology if

$$\int_{\mathcal{E}} \psi(x)\, \mu_n(dx) \to \int_{\mathcal{E}} \psi(x)\, \mu(dx) , \qquad \forall \psi \in \mathcal{C}_b(\mathcal{E}) ,$$

where $\mathcal{C}_b(\mathcal{E})$ denotes the space of bounded continuous functions from $\mathcal{E}$ into $\mathbf{R}$. In the
sequel, we will use the notation $\mathcal{M}_1(\mathcal{E})$ to denote the space of probability measures on
a Polish space $\mathcal{E}$, endowed with the topology of weak convergence.

Definition 2.2 A transition semigroup $\mathcal{P}_t$ on a Polish space $\mathcal{E}$ is Feller if it maps $\mathcal{C}_b(\mathcal{E})$
into $\mathcal{C}_b(\mathcal{E})$.

Remark 2.3 This definition is equivalent to the requirement that $x \mapsto \mathcal{P}_t(x, \cdot\,)$ is continuous
from $\mathcal{E}$ to $\mathcal{M}_1(\mathcal{E})$. As a consequence, Feller semigroups preserve the weak
topology in the sense that if $\mu_n \to \mu$ in $\mathcal{M}_1(\mathcal{E})$, then $\mathcal{P}_t \mu_n \to \mathcal{P}_t \mu$ in $\mathcal{M}_1(\mathcal{E})$ for every
given $t$.

Now that we have defined the "good" objects for the "noisy" part of our construction,
we turn to the trajectories on the state space. We are looking for a space which has
good topological properties but which is large enough to contain most interesting examples.
One such space is the space of càdlàg paths (continu à droite, limite à gauche:
continuous on the right, limits on the left), which can be turned into a Polish space
when equipped with a suitable topology. 

Definition 2.4 Given a Polish space $\mathcal{E}$ and a positive number $T$, the space $\mathcal{D}([0, T], \mathcal{E})$
is the set of functions $f : [0, T] \to \mathcal{E}$ that are right-continuous and whose left-limits
exist at every point. A sequence $\{f_n\}_{n \in \mathbf{N}}$ converges to a limit $f$ if and only if there
exists a sequence $\{\lambda_n\}$ of continuous and increasing functions $\lambda_n : [0, T] \to [0, T]$
satisfying $\lambda_n(0) = 0$, $\lambda_n(T) = T$, and such that

$$\lim_{n \to \infty} \sup_{0 \le s < t \le T} \left| \log \frac{\lambda_n(t) - \lambda_n(s)}{t - s} \right| = 0 , \tag{2.1}$$

and

$$\lim_{n \to \infty} \sup_{0 \le t \le T} d\bigl(f_n(t), f(\lambda_n(t))\bigr) = 0 , \tag{2.2}$$

where $d$ is any totally bounded metric on $\mathcal{E}$ which generates its topology.

The space $\mathcal{D}(\mathbf{R}_+, \mathcal{E})$ is the space of all functions from $\mathbf{R}_+$ to $\mathcal{E}$ such that their restrictions
to $[0, T]$ are in $\mathcal{D}([0, T], \mathcal{E})$ for all $T > 0$. A sequence converges in $\mathcal{D}(\mathbf{R}_+, \mathcal{E})$
if there exists a sequence $\{\lambda_n\}$ of continuous and increasing functions $\lambda_n : \mathbf{R}_+ \to \mathbf{R}_+$
satisfying $\lambda_n(0) = 0$ and such that (2.1) and (2.2) hold.

It can be shown (see e.g. [EK86] for a proof) that the spaces $\mathcal{D}([0, T], \mathcal{E})$ and
$\mathcal{D}(\mathbf{R}_+, \mathcal{E})$ are Polish when equipped with the above topology (usually called the Skorohod
topology). Notice that the space $\mathcal{D}([0, T], \mathcal{E})$ has a natural embedding into
$\mathcal{D}(\mathbf{R}_+, \mathcal{E})$ by setting $f(t) = f(T)$ for $t > T$ and that this embedding is continuous.
However, the restriction operator from $\mathcal{D}(\mathbf{R}_+, \mathcal{E})$ to $\mathcal{D}([0, T], \mathcal{E})$ is not continuous,
since the topology on $\mathcal{D}([0, T], \mathcal{E})$ imposes that $f_n(T) \to f(T)$, which is not imposed
by the topology on $\mathcal{D}(\mathbf{R}_+, \mathcal{E})$.

In many interesting situations, it is enough to work with continuous sample paths, 
which live in much simpler spaces: 



Definition 2.5 Given a Polish space $\mathcal{E}$ and a positive number $T$, the space $\mathcal{C}([0, T], \mathcal{E})$
is the set of continuous functions $f : [0, T] \to \mathcal{E}$ equipped with the supremum norm.

The space $\mathcal{C}(\mathbf{R}_+, \mathcal{E})$ is the space of all functions from $\mathbf{R}_+$ to $\mathcal{E}$ such that their restrictions
to $[0, T]$ are in $\mathcal{C}([0, T], \mathcal{E})$ for all $T > 0$. A sequence converges in $\mathcal{C}(\mathbf{R}_+, \mathcal{E})$
if all its restrictions converge.

It is a standard result that the spaces $\mathcal{C}([0, T], \mathcal{E})$ and $\mathcal{C}(\mathbf{R}_+, \mathcal{E})$ are Polish if $\mathcal{E}$ is
Polish. We can now turn to the definition of the systems we are interested in.

2.2 Definition of a SDS

Let us recall the following standard notations. Given a product space $X \times W$, we denote
by $\Pi_X$ and $\Pi_W$ the maps that select the first (resp. second) component of an element.
Also, given two measurable spaces $\mathcal{E}$ and $\mathcal{F}$, a measurable map $f : \mathcal{E} \to \mathcal{F}$, and a
measure $\mu$ on $\mathcal{E}$, we define the measure $f^*\mu$ on $\mathcal{F}$ in the natural way by $f^*\mu = \mu \circ f^{-1}$.
We first define the class of noise processes we will be interested in:

Definition 2.6 A quadruple $(W, \{\mathcal{P}_t\}_{t \ge 0}, \mathbf{P}_w, \{\theta_t\}_{t \ge 0})$ is called a stationary noise
process if it satisfies the following:

i) $W$ is a Polish space,

ii) $\mathcal{P}_t$ is a Feller transition semigroup on $W$, which accepts $\mathbf{P}_w$ as its unique invariant
measure,

iii) The family $\{\theta_t\}_{t \ge 0}$ is a semiflow of measurable maps on $W$ satisfying the property
$\theta_t^* \mathcal{P}_t(x, \cdot\,) = \delta_x$ for every $x \in W$.

This leads to the following definition of SDS, which is intentionally kept as close 
as possible to the definition of RDS in [Arn98, Def. 1.1.1]: 

Definition 2.7 A stochastic dynamical system on the Polish space $X$ over the stationary
noise process $(W, \{\mathcal{P}_t\}_{t \ge 0}, \mathbf{P}_w, \{\theta_t\}_{t \ge 0})$ is a mapping

$$\varphi : \mathbf{R}_+ \times X \times W \to X , \qquad (t, x, w) \mapsto \varphi_t(x, w) ,$$

with the following properties:

(SDS1) Regularity of paths: For every $T > 0$, $x \in X$, and $w \in W$, the map
$\Phi_T(x, w) : [0, T] \to X$ defined by

$$\Phi_T(x, w)(t) = \varphi_t(x, \theta_{T-t} w) ,$$

belongs to $\mathcal{D}([0, T], X)$.

(SDS2) Continuous dependence: The maps $(x, w) \mapsto \Phi_T(x, w)$ are continuous from
$X \times W$ to $\mathcal{D}([0, T], X)$ for every $T > 0$.

(SDS3) Cocycle property: The family of mappings $\varphi_t$ satisfies

$$\varphi_0(x, w) = x , \qquad \varphi_{s+t}(x, w) = \varphi_s(\varphi_t(x, \theta_s w), w) , \tag{2.3}$$

for all $s, t > 0$, all $x \in X$, and all $w \in W$.

Remark 2.8 The above definition is very close to the definition of Markovian random 
dynamical system introduced in [Cra91]. Beyond the technical differences, the main 
difference is a shift in the viewpoint: a Markovian RDS is built on top of a RDS, so 
one can analyse it from both a semigroup point of view and a RDS point of view. In the 
case of a SDS as defined above, there is no underlying RDS (although one can always 
construct one), so the semigroup point of view is the only one we consider.

Remark 2.9 The cocycle property (2.3) looks different from the cocycle property for
random dynamical systems. Actually, in our case $\varphi$ is a backward cocycle for $\theta_t$, which
is reasonable since, as a "left inverse" for $\mathcal{P}_t$, $\theta_t$ actually pushes time backward. Notice
also that, unlike in the definition of RDS, we require some continuity property with 
respect to the noise to hold. This continuity property sounds quite restrictive, but it is 
actually mainly a matter of choosing a topology on W, which is in a sense "compatible" 
with the topology on X. 

Similarly, we define a continuous (where "continuous" should be thought of as 
continuous with respect to time) SDS by 

Definition 2.10 A SDS is said to be continuous if $\mathcal{D}([0, T], X)$ can be replaced by
$\mathcal{C}([0, T], X)$ in the above definition.

Remark 2.11 One can check that the embeddings $\mathcal{C}([0, T], X) \hookrightarrow \mathcal{D}([0, T], X)$ and
$\mathcal{C}(\mathbf{R}_+, X) \hookrightarrow \mathcal{D}(\mathbf{R}_+, X)$ are continuous, so a continuous SDS also satisfies Definition 2.7
of a SDS.

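Given a SDS as in Definition 2.7 and an initial condition $x_0 \in X$, we now turn
to the construction of a stochastic process with initial condition $x_0$ constructed in a
natural way from $\varphi$. First, given $t \ge 0$ and $(x, w) \in X \times W$, we construct a probability
measure $\mathcal{Q}_t(x, w; \cdot\,)$ on $X \times W$ by

$$\mathcal{Q}_t(x, w; A \times B) = \int_B \delta_{\varphi_t(x, w')}(A)\, \mathcal{P}_t(w, dw') , \tag{2.4}$$

where $\delta_x$ denotes the delta measure located at $x$. The following result is elementary:

Lemma 2.12 Let $\varphi$ be a SDS on $X$ over $(W, \{\mathcal{P}_t\}_{t \ge 0}, \mathbf{P}_w, \{\theta_t\}_{t \ge 0})$ and define the
family of measures $\mathcal{Q}_t(x, w; \cdot\,)$ by (2.4). Then $\mathcal{Q}_t$ is a Feller transition semigroup on
$X \times W$. Furthermore, it has the property that if $\Pi_W^* \mu = \mathbf{P}_w$ for a measure $\mu$ on $X \times W$,
then $\Pi_W^* \mathcal{Q}_t \mu = \mathbf{P}_w$.

Proof. The fact that $\Pi_W^* \mathcal{Q}_t \mu = \mathbf{P}_w$ follows from the invariance of $\mathbf{P}_w$ under $\mathcal{P}_t$. We
now check that $\mathcal{Q}_t$ is a Feller transition semigroup. Conditions i) and iii) follow immediately
from the properties of $\varphi$. The continuity of $\mathcal{Q}_t(x, w; \cdot\,)$ with respect to $(x, w)$ is
a straightforward consequence of the facts that $\mathcal{P}_t$ is Feller and that $(x, w) \mapsto \varphi_t(x, w)$
is continuous (the latter statement follows from (SDS2) and the definition of the topology
on $\mathcal{D}([0, t], X)$).

It thus remains only to check that the Chapman-Kolmogorov equation holds. We
have from the cocycle property:

$$\mathcal{Q}_{s+t}(x, w; A \times B) = \int_B \delta_{\varphi_s(\varphi_t(x, \theta_s w'), w')}(A)\, \mathcal{P}_{s+t}(w, dw') = \int_W \int_B \delta_{\varphi_s(\varphi_t(x, \theta_s w'), w')}(A)\, \mathcal{P}_s(w'', dw')\, \mathcal{P}_t(w, dw'') .$$

The claim then follows from the property $\theta_s^* \mathcal{P}_s(w'', dw') = \delta_{w''}(dw')$ by exchanging
the order of integration. □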

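Remark 2.13 Actually, (2.4) defines the evolution of the one-point process generated
by $\varphi$. The $n$-points process would evolve according to

$$\mathcal{Q}_t^{(n)}(x_1, \ldots, x_n, w; A_1 \times \ldots \times A_n \times B) = \int_B \prod_{i=1}^n \delta_{\varphi_t(x_i, w')}(A_i)\, \mathcal{P}_t(w, dw') .$$

One can check as above that this defines a Feller transition semigroup on $X^n \times W$.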
This lemma suggests the following definition: 

Definition 2.14 Let $\varphi$ be a SDS as above. Then a probability measure $\mu$ on $X \times W$
is called a generalised initial condition for $\varphi$ if $\Pi_W^* \mu = \mathbf{P}_w$. We denote by $\mathcal{M}_\varphi$ the
space of generalised initial conditions endowed with the topology of weak convergence.
Elements of $\mathcal{M}_\varphi$ that are of the form $\mu = \delta_x \times \mathbf{P}_w$ for some $x \in X$ will be called initial
conditions.

Given a generalised initial condition $\mu$, it is natural to construct a stochastic process
$(x_t, w_t)$ on $X \times W$ by drawing its initial condition according to $\mu$ and then evolving it
according to the transition semigroup $\mathcal{Q}_t$. The marginal $x_t$ of this process on $X$ will be
called the process generated by $\varphi$ for $\mu$. We will denote by $Q\mu$ the law of this process
(i.e. $Q\mu$ is a measure on $\mathcal{D}(\mathbf{R}_+, X)$ in the general case and a measure on $\mathcal{C}(\mathbf{R}_+, X)$ in
the continuous case). More rigorously, we define for every $T > 0$ the measure $Q_T \mu$ on
$\mathcal{D}([0, T], X)$ by

$$Q_T \mu = \Phi_T^* \mathcal{P}_T \mu ,$$

where $\Phi_T$ is defined as in (SDS1) and $\mathcal{P}_T$ acts on the $W$-component of $\mu$. By the embedding
$\mathcal{D}([0, T], X) \hookrightarrow \mathcal{D}(\mathbf{R}_+, X)$, this actually gives a family of measures on $\mathcal{D}(\mathbf{R}_+, X)$. It follows
from the cocycle property that the restriction to $\mathcal{D}([0, T], X)$ of $Q_{T'} \mu$ with $T' > T$ is equal to $Q_T \mu$.
The definition of the topology on $\mathcal{D}(\mathbf{R}_+, X)$ does therefore imply that the sequence
$Q_T \mu$ converges weakly to a unique measure on $\mathcal{D}(\mathbf{R}_+, X)$ that we denote by $Q\mu$. A
similar argument, combined with (SDS2), yields

Lemma 2.15 Let $\varphi$ be a SDS. Then, the operator $Q$ as defined above is continuous
from $\mathcal{M}_\varphi$ to $\mathcal{M}_1(\mathcal{D}(\mathbf{R}_+, X))$. □

This in turn motivates the following equivalence relation: 

Definition 2.16 Two generalised initial conditions $\mu$ and $\nu$ of a SDS $\varphi$ are equivalent
if the processes generated by $\mu$ and $\nu$ are equal in law. In short, $\mu \sim \nu \Leftrightarrow Q\mu = Q\nu$.

The physical interpretation of this notion of equivalence is that the noise space contains 
some redundant information that is not required to construct the future of the system. 
Note that this does not necessarily mean that the noise space could be reduced in order 
to have a more "optimal" description of the system. For example, if the process $x_t$ generated
by any generalised initial condition is Markov, then all the information contained
in $W$ is redundant in the above sense (i.e. $\mu$ and $\nu$ are equivalent if $\Pi_X^* \mu = \Pi_X^* \nu$). This
does of course not mean that $W$ can be entirely thrown away in the above description
(otherwise, since the map $\varphi$ is deterministic, the evolution would become deterministic).

The main reason for introducing the notion of SDS is to have a framework in which 
one can study ergodic properties of physical systems with memory. It should be noted 
that it is designed to describe systems where the memory is extrinsic, as opposed to 
systems with intrinsic memory like stochastic delay equations. We present in the next 
subsection a few elementary ergodic results in the framework of SDS. 

2.3 Ergodic properties 

In the theory of Markov processes, the main tool for investigating ergodic properties 
is the invariant measure. In the setup of SDS, we say that a measure $\mu$ on $X \times W$
is invariant for the SDS $\varphi$ if it is invariant for the Markov transition semigroup $\mathcal{Q}_t$
generated by $\varphi$. We say that a measure $\mu$ on $X \times W$ is stationary for $\varphi$ if one has

$$\mathcal{Q}_t \mu \sim \mu , \qquad \forall t > 0 ,$$

i.e. if the process on $X$ generated by $\mu$ is stationary. Following our philosophy of considering
only what happens on the state space $X$, we should be interested in stationary
measures, disregarding completely whether they are actually invariant or not. In doing
so, we could be afraid of losing many convenient results from the well-developed
theory of Markov processes. Fortunately, the following proposition shows that the set of
invariant measures and the set of stationary measures are actually the same, when quotiented
by the equivalence relation of Definition 2.16.

Proposition 2.17 Let $\varphi$ be a SDS and let $\mu$ be a stationary measure for $\varphi$. Then, there
exists a measure $\mu_* \sim \mu$ which is invariant for $\varphi$.

Proof. Define the ergodic averages

$$\mathcal{R}_T \mu = \frac{1}{T} \int_0^T \mathcal{Q}_t \mu\, dt . \tag{2.5}$$

Since $\mu$ is stationary, we have $\Pi_X^* \mathcal{R}_T \mu = \Pi_X^* \mu$ for every $T$. Furthermore, $\Pi_W^* \mathcal{R}_T \mu = \mathbf{P}_w$
for every $T$, therefore the sequence of measures $\mathcal{R}_T \mu$ is tight on $X \times W$. Let $\mu_*$ be
any of its accumulation points in $\mathcal{M}_1(X \times W)$. Since $\mathcal{Q}_t$ is Feller, $\mu_*$ is invariant for
$\mathcal{Q}_t$ and, by Lemma 2.15, one has $\mu_* \sim \mu$. □
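
The mechanism behind ergodic averages such as (2.5) is easiest to see in a toy setting. The Python sketch below is a deliberately simplified analogue and not the construction used in the proof: it assumes a finite state space and discrete time, replaces the time integral by a Cesàro sum, and uses hypothetical function names. For the deterministic flip between two states, the laws at successive times never converge, whereas their averages become invariant up to an error of order $1/N$.

```python
def push(mu, kernel):
    """One step of the chain acting on a probability vector: mu -> mu P."""
    n = len(mu)
    return [sum(mu[x] * kernel[x][y] for x in range(n)) for y in range(n)]


def ergodic_average(mu, kernel, steps):
    """Discrete analogue of (2.5): (1 / steps) * sum over k < steps of mu P^k."""
    total = [0.0] * len(mu)
    current = list(mu)
    for _ in range(steps):
        total = [a + b for a, b in zip(total, current)]
        current = push(current, kernel)
    return [a / steps for a in total]


def invariance_defect(mu, kernel):
    """Largest entry of |mu P - mu|; it vanishes iff mu is invariant."""
    return max(abs(a - b) for a, b in zip(push(mu, kernel), mu))


# Deterministic flip between two states: the law at time k oscillates forever.
FLIP = [[0.0, 1.0], [1.0, 0.0]]

if __name__ == "__main__":
    start = [1.0, 0.0]
    print(invariance_defect(start, FLIP))
    print(invariance_defect(ergodic_average(start, FLIP, 1001), FLIP))
```

The telescoping identity behind the $1/N$ bound is that the averaged measure, pushed forward once, differs from itself only by the first and last terms of the sum divided by the number of steps.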

From a mathematical point of view, it may in some cases be interesting to know 
whether the invariant measure $\mu_*$ constructed in Proposition 2.17 is uniquely determined
by $\mu$. From an intuitive point of view, this uniqueness property should hold if
the information contained in the trajectories on the state space X is sufficient to re- 
construct the evolution of the noise. This intuition is made rigorous by the following 
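proposition.

Proposition 2.18 Let $\varphi$ be a SDS, define $\mathscr{W}_T^x$ as the $\sigma$-field on $W$ generated by the map
$\Phi_T(x, \cdot\,) : W \to \mathcal{D}([0, T], X)$, and set $\mathscr{W}_T = \bigwedge_{x \in X} \mathscr{W}_T^x$. Assume that $\mathscr{W}_T \subset \mathscr{W}_{T'}$
for $T < T'$ and that $\mathscr{W} = \bigvee_{T \ge 0} \mathscr{W}_T$ is equal to the Borel $\sigma$-field on $W$. Then, for $\mu_1$
and $\mu_2$ two invariant measures, one has the implication $\mu_1 \sim \mu_2 \Rightarrow \mu_1 = \mu_2$.

Proof. Assume $\mu_1 \sim \mu_2$ are two invariant measures for $\varphi$. Since $\mathscr{W}_T \subset \mathscr{W}_{T'}$ if
$T < T'$, their equality follows if one can show that, for every $T > 0$,

$$\mathbf{E}(\mu_1 \mid \mathscr{X} \otimes \mathscr{W}_T) = \mathbf{E}(\mu_2 \mid \mathscr{X} \otimes \mathscr{W}_T) , \tag{2.6}$$

where $\mathscr{X}$ denotes the Borel $\sigma$-field on $X$.

Since $\mu_1 \sim \mu_2$, one has in particular $\Pi_X^* \mu_1 = \Pi_X^* \mu_2$, so let us call this measure
$\nu$. Since $W$ is Polish, we then have the disintegration $x \mapsto \mu_i^x$, yielding formally
$\mu_i(dx, dw) = \mu_i^x(dw)\, \nu(dx)$, where $\mu_i^x$ are probability measures on $W$. (See [GS77,
p. 196] for a proof.) Fix $T > 0$ and define the family $\mu_i^{x,T}$ of probability measures on
$W$ by

$$\mu_i^{x,T} = \int_W \mathcal{P}_T(w, \cdot\,)\, \mu_i^x(dw) .$$

With this definition, one has

$$Q_T \mu_i = \int_X \bigl(\Phi_T(x, \cdot\,)^* \mu_i^{x,T}\bigr)\, \nu(dx) .$$

Let $e_0 : \mathcal{D}([0, T], X) \to X$ be the evaluation map at $0$, then

$$\mathbf{E}(Q_T \mu_i \mid e_0 = x) = \Phi_T(x, \cdot\,)^* \mu_i^{x,T} ,$$

for $\nu$-almost every $x \in X$. Since $Q_T \mu_1 = Q_T \mu_2$, one therefore has

$$\mathbf{E}(\mu_1^{x,T} \mid \mathscr{W}_T^x) = \mathbf{E}(\mu_2^{x,T} \mid \mathscr{W}_T^x) , \tag{2.7}$$

for $\nu$-almost every $x \in X$. On the other hand, the invariance of $\mu_i$ implies that, for
every $A \in \mathscr{X}$ and every $B \in \mathscr{W}_T$, one has the equality

$$\mu_i(A \times B) = \int_X \int_B \chi_A(\varphi_T(x, w))\, \mu_i^{x,T}(dw)\, \nu(dx) .$$

Since $\varphi_T(x, \cdot\,)$ is $\mathscr{W}_T^x$-measurable and $B \in \mathscr{W}_T^x$, this is equal to

$$\int_X \int_B \chi_A(\varphi_T(x, w))\, \mathbf{E}(\mu_i^{x,T} \mid \mathscr{W}_T^x)(dw)\, \nu(dx) .$$

Thus (2.7) implies (2.6) and the proof of Proposition 2.18 is complete. □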

The existence of an invariant measure is usually established by finding a Lyapunov 
function. In this setting, Lyapunov functions are given by the following definition. 

Definition 2.19 Let $\varphi$ be a SDS and let $F : X \to [0, \infty)$ be a continuous function.
Then $F$ is a Lyapunov function for $\varphi$ if it satisfies the following conditions:

(L1) The set $F^{-1}([0, C])$ is compact for every $C \in [0, \infty)$.

(L2) There exist constants $C$ and $\gamma > 0$ such that

$$\int_{X \times W} F(x)\, (\mathcal{Q}_t \mu)(dx, dw) \le C + e^{-\gamma t} \int_X F(x)\, (\Pi_X^* \mu)(dx) \tag{2.8}$$

for every $t > 0$ and every generalised initial condition $\mu$ such that the right-hand
side is finite.

It is important to notice that one does not require $F$ to be a Lyapunov function
for the transition semigroup $\mathcal{Q}_t$, since (2.8) is only required to hold for measures $\mu$
satisfying $\Pi_W^* \mu = \mathbf{P}_w$. One nevertheless has the following result:

Lemma 2.20 Let $\varphi$ be a SDS. If there exists a Lyapunov function $F$ for $\varphi$, then there
exists also an invariant measure $\mu_*$ for $\varphi$, which satisfies

$$\int_{X \times W} F(x)\, \mu_*(dx, dw) \le C . \tag{2.9}$$

Proof. Let $x \in X$ be an arbitrary initial condition, set $\mu = \delta_x \times \mathbf{P}_w$, and define
the ergodic averages $\mathcal{R}_T \mu$ as in (2.5). Combining (L1) and (L2) with the fact that
$\Pi_W^* \mathcal{R}_T \mu = \mathbf{P}_w$, one immediately gets the tightness of the sequence $\{\mathcal{R}_T \mu\}$. By the
standard Krylov-Bogoliubov argument, any limiting point of $\{\mathcal{R}_T \mu\}$ is an invariant
measure for $\varphi$. The estimate (2.9) follows from (2.8), combined with the fact that $F$ is
continuous. □

This concludes our presentation of the abstract framework in which we analyse the
ergodic properties of (SDE).

3 Construction of the SDS

In this section, we construct a continuous stochastic dynamical system which yields
the solutions to (SDE) in an appropriate sense.

First of all, let us discuss what we mean by "solution" to (SDE).

Definition 3.1 Let $\{x_t\}_{t \ge 0}$ be a stochastic process with continuous sample paths. We
say that $x_t$ is a solution to (SDE) if the stochastic process $N(t)$ defined by

$$N(t) = x_t - x_0 - \int_0^t f(x_s)\, ds , \tag{3.1}$$

is equal in law to $\sigma B_H(t)$, where $\sigma$ is as in (SDE) and $B_H(t)$ is an $n$-dimensional fBm
with Hurst parameter $H$.

We will set up our SDS in such a way that, for every generalised initial condition
$\mu$, the canonical process associated to the measure $Q\mu$ is a solution to (SDE). This will
be the content of Proposition 3.11 below. In order to achieve this, our main task is to
set up a noise process in a way which complies with Definition 2.6.

3.1 Representation of the fBm

In this section, we give a representation of the fBm $B_H(t)$ with Hurst parameter
$H \in (0,1)$ which is suitable for our analysis. Recall that, by definition, $B_H(t)$ is a centred
Gaussian process satisfying $B_H(0) = 0$ and

$$\mathbf{E}|B_H(t) - B_H(s)|^2 = |t-s|^{2H} . \tag{3.2}$$
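
Since $B_H$ is centred and Gaussian, the second moments of the increments determine its law. Expanding the square in (3.2) and using $B_H(0) = 0$ gives the covariance $\frac12(|s|^{2H} + |t|^{2H} - |t-s|^{2H})$. The following Python sketch (the function names are ours and purely illustrative) evaluates this covariance and recovers (3.2) from it.

```python
def fbm_cov(s: float, t: float, hurst: float) -> float:
    """Covariance E[B_H(s) B_H(t)] implied by (3.2) together with B_H(0) = 0."""
    p = 2.0 * hurst
    return 0.5 * (abs(s) ** p + abs(t) ** p - abs(t - s) ** p)


def increment_second_moment(s: float, t: float, hurst: float) -> float:
    """E|B_H(t) - B_H(s)|^2 recomputed from the covariance function."""
    return fbm_cov(t, t, hurst) + fbm_cov(s, s, hurst) - 2.0 * fbm_cov(s, t, hurst)
```

For $H = \frac12$ the covariance between a negative and a positive time vanishes, whereas for $H \neq \frac12$ it does not, so the past and the future of the process are correlated.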

Naturally, one defines a two-sided fractional Brownian motion by requiring that (3.2) holds for all
$s, t \in \mathbf{R}$. Notice that, unlike for the normal Brownian motion, the two-sided fBm is not
obtained by gluing two independent copies of the one-sided fBm together at $t = 0$. We
have the following useful representation of the two-sided fBm, which is also (up to the
normalisation constant) the representation used in the original paper [MVN68].

Lemma 3.2 Let $w(t)$, $t \in \mathbf{R}$ be a two-sided Wiener process and let $H \in (0,1)$. Define
for some constant $\alpha_H$ the process

$$B_H(t) = \alpha_H \int_{-\infty}^0 (-r)^{H-\frac12}\bigl(dw(r+t) - dw(r)\bigr) . \tag{3.3}$$

Then there exists a choice of $\alpha_H$ such that $B_H(t)$ is a two-sided fractional Brownian
motion with Hurst parameter $H$. □

Notation 3.3 Given the representation (3.3) of the fBm with Hurst parameter $H$, we
call $w$ the "Wiener process associated to $B_H$". We also refer to $\{w(t) : t < 0\}$ as the
"past" of $w$ and to $\{w(t) : t > 0\}$ as the "future" of $w$. We similarly refer to the "past"
and the "future" of $B_H$. Notice that the notion of future for $B_H$ is different from the notion
of future for $w$ in terms of $\sigma$-algebras, since the future of $B_H$ depends on the past of
$w$.

Remark 3.4 The expression (3.3) looks strange at first sight, but one should actually
think of $B_H(t)$ as being given by $B_H(t) = \tilde B_H(t) - \tilde B_H(0)$, where

$$\text{``}\,\tilde B_H(t) = \alpha_H \int_{-\infty}^t (t-s)^{H-\frac12}\,dw(s)\,\text{''} . \tag{3.4}$$

This expression is strongly reminiscent of the usual representation of the stationary
Ornstein-Uhlenbeck process, but with an algebraic kernel instead of an exponential
one. Of course, (3.4) does not make any sense since $(t-s)^{H-\frac12}$ is not square
integrable. Nevertheless, (3.4) has the advantage of explicitly showing the stationarity of
the increments for the two-sided fBm.

3.2 Noise spaces 

In this section, we introduce the family of spaces that will be used to model our noise.
Denote by $C_0^\infty(\mathbf{R}_-)$ the set of $C^\infty$ functions $w : (-\infty, 0] \to \mathbf{R}$ satisfying $w(0) = 0$
and having compact support. Given a parameter $H \in (0,1)$, we define for every
$w \in C_0^\infty(\mathbf{R}_-)$ the norm

$$\|w\|_H = \sup_{t,s\in\mathbf{R}_-} \frac{|w(t) - w(s)|}{|t-s|^{\frac{1-H}{2}}\,(1+|t|+|s|)^{\frac12}} . \tag{3.5}$$

We then define the Banach space $\mathcal{H}_H$ to be the closure of $C_0^\infty(\mathbf{R}_-)$ under the norm
$\|\cdot\|_H$. The following lemma is important in view of the framework exposed in
Section 2:



Lemma 3.5 The spaces $\mathcal{H}_H$ are separable.

Proof. It suffices to find a norm $\|\cdot\|_\star$ which is stronger than $\|\cdot\|_H$ and such that the
closure of $C_0^\infty(\mathbf{R}_-)$ under $\|\cdot\|_\star$ is separable. One example of such a norm is given by

$$\|w\|_\star = \sup_{t<0} |t\,\dot w(t)| .$$ □

Notice that it is crucial to define $\mathcal{H}_H$ as the closure of $C_0^\infty$ under $\|\cdot\|_H$. If we
defined it simply as the space of all functions with finite $\|\cdot\|_H$-norm, it would not be
separable. (Think of the space of bounded continuous functions, versus the space of
continuous functions vanishing at infinity.)
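
To get a feeling for the norm (3.5), one can evaluate the ratio over a finite set of sample times. The Python sketch below does this; it is only an illustration (the name `noise_norm` and the sampling are our own choices), and since it takes a maximum over finitely many pairs it yields a lower bound on $\|w\|_H$, not the norm itself.

```python
import math


def noise_norm(times, values, hurst):
    """Largest ratio appearing in (3.5) over all pairs of distinct sample points.

    `times` are non-positive sample times and `values` the corresponding w(t).
    """
    best = 0.0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            t, s = times[i], times[j]
            holder = abs(t - s) ** ((1.0 - hurst) / 2.0)   # Hoelder factor
            growth = math.sqrt(1.0 + abs(t) + abs(s))      # growth factor
            best = max(best, abs(values[i] - values[j]) / (holder * growth))
    return best
```

The two factors in the denominator play different roles: the first controls the local regularity of $w$, the second its growth at $-\infty$.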

In view of the representation (3.3), we define the linear operator $\mathcal{D}_H$ on functions
$w \in C_0^\infty$ by

$$(\mathcal{D}_H w)(t) = \alpha_H \int_{-\infty}^0 (-s)^{H-\frac12}\bigl(\dot w(s+t) - \dot w(s)\bigr)\,ds , \tag{3.6}$$

where $\alpha_H$ is as in Lemma 3.2. We have the following result:

Lemma 3.6 Let $H \in (0,1)$ and let $\mathcal{H}_H$ be as above. Then the operator $\mathcal{D}_H$, formally
defined by (3.6), is continuous from $\mathcal{H}_H$ into $\mathcal{H}_{1-H}$. Furthermore, the operator $\mathcal{D}_H$
has a bounded inverse, given by the formula

$$\mathcal{D}_H^{-1} = \gamma_H \mathcal{D}_{1-H} ,$$

for some constant $\gamma_H$ satisfying $\gamma_H = \gamma_{1-H}$.

Remark 3.7 The operator $\mathcal{D}_H$ is actually (up to a multiplicative constant) a fractional
integral of order $H - \frac12$ which is renormalised in such a way that one gets rid of the
divergence at $-\infty$. It is therefore not surprising that the inverse of $\mathcal{D}_H$ is
(up to a constant) $\mathcal{D}_{1-H}$.

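Proof. For $H = \frac12$, $\mathcal{D}_H$ is the identity and there is nothing to prove. We therefore
assume in the sequel that $H \neq \frac12$.

We first show that $\mathcal{D}_H$ is continuous from $\mathcal{H}_H$ into $\mathcal{H}_{1-H}$. One can easily check
that $\mathcal{D}_H$ maps $C_0^\infty$ into the set of functions which converge to a constant at $-\infty$.
This set can be seen to belong to $\mathcal{H}_{1-H}$ by a simple cutoff argument, so it suffices to
show that $\|\mathcal{D}_H w\|_{1-H} \le C\|w\|_H$ for $w \in C_0^\infty$. Assume without loss of generality
that $t > s$ and define $h = t - s$. We then have

$$(\mathcal{D}_H w)(t) - (\mathcal{D}_H w)(s) = \alpha_H \int_{-\infty}^s \bigl((t-r)^{H-\frac12} - (s-r)^{H-\frac12}\bigr)\,dw(r) + \alpha_H \int_s^t (t-r)^{H-\frac12}\,dw(r) .$$

Splitting the integral and integrating by parts yields

$$\begin{aligned}
(\mathcal{D}_H w)(t) - (\mathcal{D}_H w)(s) ={}& -\alpha_H(H-\tfrac12) \int_{s-h}^{s} (s-r)^{H-\frac32}\bigl(w(r) - w(s)\bigr)\,dr \\
& + \alpha_H(H-\tfrac12) \int_{t-2h}^{t} (t-r)^{H-\frac32}\bigl(w(r) - w(t)\bigr)\,dr \\
& + \alpha_H(H-\tfrac12) \int_{-\infty}^{s-h} \bigl((t-r)^{H-\frac32} - (s-r)^{H-\frac32}\bigr)\bigl(w(r) - w(s)\bigr)\,dr \\
& + \alpha_H (2h)^{H-\frac12}\bigl(w(t) - w(s)\bigr) \\
\equiv{}& T_1 + T_2 + T_3 + T_4 .
\end{aligned}$$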
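We estimate each of these terms separately. For $T_1$, we have

$$|T_1| \le C(1+|s|+|t|)^{\frac12} \int_0^h r^{H-\frac32+\frac{1-H}{2}}\,dr \le C h^{\frac H2}(1+|s|+|t|)^{\frac12} .$$

The term $T_2$ is bounded by $C h^{\frac H2}(1+|s|+|t|)^{\frac12}$ in a similar way. Concerning $T_3$, we
bound it by

$$\begin{aligned}
|T_3| &\le C \int_h^\infty \bigl(r^{H-\frac32} - (h+r)^{H-\frac32}\bigr)\,|w(s-r) - w(s)|\,dr \\
&\le C h \int_h^\infty r^{H-\frac52}\, r^{\frac{1-H}{2}} (1+|s|+|r|)^{\frac12}\,dr \\
&\le C h^{\frac H2}(1+|s|)^{\frac12} + C h \int_h^\infty r^{\frac H2-2}(h+r)^{\frac12}\,dr \\
&\le C h^{\frac H2}(1+|s|+h)^{\frac12} \le C h^{\frac H2}(1+|s|+|t|)^{\frac12} .
\end{aligned}$$

The term $T_4$ is easily bounded by $C h^{\frac H2}(1+|s|+|t|)^{\frac12}$, using the fact that $w \in \mathcal{H}_H$.
This shows that $\mathcal{D}_H$ is bounded from $\mathcal{H}_H$ to $\mathcal{H}_{1-H}$.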

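It remains to show that $\mathcal{D}_H \circ \mathcal{D}_{1-H}$ is a multiple of the identity. For this, notice
that if $w \in C_0^\infty$, then one has in the notations of [SKM93, pp. 94-95] the following
identities

$$\begin{aligned}
(\mathcal{D}_H w)(t) &= -\alpha_H \Gamma(H+\tfrac12)\bigl((I_+^{H-\frac12} w)(t) - (I_+^{H-\frac12} w)(0)\bigr) , && H > \tfrac12 , \\
(\mathcal{D}_H w)(t) &= -\alpha_H \Gamma(H+\tfrac12)\bigl((D_+^{\frac12-H} w)(t) - (D_+^{\frac12-H} w)(0)\bigr) , && H < \tfrac12 .
\end{aligned}$$

Furthermore, (3.6) shows that $\mathcal{D}_H w = 0$ if $w$ is a constant. The claim then follows
immediately from the fact that if $w \in C_0^\infty$ and $\alpha \in (0,1)$, one has $D_+^\alpha I_+^\alpha w = w$ and
$I_+^\alpha D_+^\alpha w = w$ (see [SKM93, Thm. 2.4]). □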

Since we want to use the operators $\mathcal{D}_H$ and $\mathcal{D}_{1-H}$ to switch between Wiener
processes and fractional Brownian motions, it is crucial to show that the sample paths of
the two-sided Wiener process belong to every $\mathcal{H}_H$ with probability 1. Actually, what
we show is that the Wiener measure can be constructed as a Borel measure on $\mathcal{H}_H$.

Lemma 3.8 There exists a unique Gaussian measure $\mathbf{W}$ on $\mathcal{H}_H$ which is such that the
canonical process associated to it is a time-reversed Brownian motion.

Proof. We start by showing that the $\mathcal{H}_H$-norm of the Wiener paths has bounded
moments of all orders. It follows from a generalisation of the Kolmogorov criterion
[RY99, Theorem 2.1] that

$$\mathbf{E}\Bigl(\sup_{s,t\in[0,2]} \frac{|w(s) - w(t)|}{|s-t|^{\frac{1-H}{2}}}\Bigr)^p < \infty \tag{3.7}$$

for all $p > 0$. Since the increments of $w$ are independent, this implies that, for every
$\varepsilon > 0$, there exists a random variable $C_1$ such that

$$\sup_{|s-t|\le 1} \frac{|w(s) - w(t)|}{|s-t|^{\frac{1-H}{2}}\,(1+|t|+|s|)^{\varepsilon}} < C_1 , \tag{3.8}$$

with probability 1, and that all the moments of $C_1$ are bounded. We can therefore
safely assume in the sequel that $|t-s| > 1$. It follows immediately from (3.8) and the
triangle inequality that there exists a constant $C$ such that

$$|w(s) - w(t)| \le C C_1 |t-s|\,(1+|t|+|s|)^{\varepsilon} , \tag{3.9}$$

whenever $|t-s| > 1$. Furthermore, it follows from the time-inversion property of the
Brownian motion, combined with (3.7), that $|w|$ does not grow much faster than $|t|^{1/2}$
for large values of $t$. In particular, for every $\varepsilon' > 0$, there exists a random variable $C_2$
such that

$$|w(t)| \le C_2 (1+|t|)^{\frac12+\varepsilon'} , \qquad \forall t \in \mathbf{R} , \tag{3.10}$$

and that all the moments of $C_2$ are bounded. Combining (3.9) and (3.10), we get (for
some other constant $C$)

$$|w(s) - w(t)| \le C\, C_1^{\frac{1-H}{2}} C_2^{\frac{1+H}{2}}\, |t-s|^{\frac{1-H}{2}}\,(1+|s|+|t|)^{\frac{H+1}{4}+\varepsilon\frac{1-H}{2}+\varepsilon'\frac{1+H}{2}} .$$

The claim follows by choosing for example $\varepsilon = \varepsilon' = (1-H)/4$.

This is not quite enough, since we want the sample paths to belong to the closure
of $C_0^\infty$ under the norm $\|\cdot\|_H$. Define the function

$$(s,t) \mapsto \Gamma(s,t) = \frac{(1+|t|+|s|)^2}{|t-s|} .$$

By looking at the above proof, we see that we actually proved the stronger statement
that for every $H \in (0,1)$, one can find a $\gamma > 0$ such that

$$\|w\|_{H,\gamma} = \sup_{s,t\in\mathbf{R}_-} \frac{\Gamma(s,t)^{\gamma}\,|w(s) - w(t)|}{|s-t|^{\frac{1-H}{2}}\,(1+|t|+|s|)^{\frac12}} < \infty$$

with probability 1. Let us call $\mathcal{H}_{H,\gamma}$ the Banach space of functions with finite
$\|\cdot\|_{H,\gamma}$-norm. We will show that one has the continuous inclusions:

$$\mathcal{H}_{H,\gamma} \hookrightarrow \mathcal{H}_H \hookrightarrow \mathcal{C}(\mathbf{R}_-,\mathbf{R}) . \tag{3.11}$$

Let us call $\mathbf{W}$ the usual time-reversed Wiener measure on $\mathcal{C}(\mathbf{R}_-,\mathbf{R})$ equipped with the
$\sigma$-field $\mathcal{R}$ generated by the evaluation functions. Since $\mathcal{H}_{H,\gamma}$ is a measurable subset of
$\mathcal{C}(\mathbf{R}_-,\mathbf{R})$ and $\mathbf{W}(\mathcal{H}_{H,\gamma}) = 1$, we can restrict $\mathbf{W}$ to a measure on $\mathcal{H}_H$, equipped with
the restriction $\tilde{\mathcal{R}}$ of $\mathcal{R}$. It remains to show that $\tilde{\mathcal{R}}$ is equal to the Borel $\sigma$-field $\mathcal{B}$ on $\mathcal{H}_H$.
This follows from the fact that the evaluation functions are $\mathcal{B}$-measurable (since they
are actually continuous) and that a countable number of function evaluations suffices
to determine the $\|\cdot\|_H$-norm of a function. The proof of Lemma 3.8 is thus complete
if we show (3.11).

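Notice first that the function $\Gamma(s,t)$ becomes large when $|t-s|$ is small or when
either $|t|$ or $|s|$ are large; more precisely we have

$$\Gamma(s,t) > \max\{|s|, |t|, |t-s|^{-1}\} . \tag{3.12}$$

Therefore, functions $w \in \mathcal{H}_{H,\gamma}$ are actually more regular and have better growth
properties than what is needed to have finite $\|\cdot\|_H$-norm. Given $w$ with $\|w\|_{H,\gamma} < \infty$ and
any $\varepsilon > 0$, we will construct a function $\tilde w \in C_0^\infty$ such that $\|w - \tilde w\|_H < \varepsilon$. Take two
$C^\infty$ functions $\varphi_1$ and $\varphi_2$ with the following shape:

[Figure: graphs of $\varphi_1(t)$, with an axis mark at $-1$, and of $\varphi_2(t)$, with axis marks at $-2$ and $-1$.]

Furthermore, we choose them such that:

$$\int_{\mathbf{R}_-} \varphi_1(s)\,ds = 1 , \qquad \Bigl|\frac{d\varphi_2(t)}{dt}\Bigr| \le 2 .$$

For two positive constants $r < 1$ and $R > 1$ to be chosen later, we define

$$\tilde w(t) = \varphi_2(t/R) \int_{\mathbf{R}_-} w(t+s)\,\frac{\varphi_1(s/r)}{r}\,ds ,$$
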

i.e. we smoothen out $w$ at length scales smaller than $r$ and we cut it off at distances
bigger than $R$. A straightforward estimate shows that there exists a constant $C$ such
that

$$\|\tilde w\|_{H,\gamma} \le C\|w\|_{H,\gamma} ,$$

independently of $r < 1/4$ and $R > 1$. For $\delta > 0$ to be chosen later, we then divide the
quadrant $K = \{(t,s) \mid t, s < 0\}$ into three regions:

$$\begin{aligned}
K_1 &= \{(t,s) \mid |t| + |s| \ge R\} \cap K , \\
K_2 &= \{(t,s) \mid |t-s| \le \delta\} \cap K \setminus K_1 , \\
K_3 &= K \setminus (K_1 \cup K_2) .
\end{aligned}$$

We then bound $\|w - \tilde w\|_H$ by

$$\begin{aligned}
\|w - \tilde w\|_H &\le \sup_{(s,t)\in K_1\cup K_2} \frac{C\|w\|_{H,\gamma}}{\Gamma(t,s)^{\gamma}} + \sup_{(s,t)\in K_3} \frac{|w(s) - \tilde w(s)| + |w(t) - \tilde w(t)|}{|t-s|^{\frac{1-H}{2}}\,(1+|t|+|s|)^{\frac12}} \\
&\le C(\delta^{\gamma} + R^{-\gamma})\|w\|_{H,\gamma} + 2\delta^{\frac{H-1}{2}} \sup_{0<t<R} |w(-t) - \tilde w(-t)| .
\end{aligned}$$

By choosing $\delta$ small enough and $R$ large enough, the first term can be made arbitrarily
small. One can then choose $r$ small enough to make the second term arbitrarily small
as well. This shows that (3.11) holds and therefore the proof of Lemma 3.8 is complete.

□

3.3 Definition of the SDS 

The results shown so far in this section are sufficient to construct the required SDS. We
start by considering the pathwise solutions to (SDE). Given a time $T > 0$, an initial
condition $x \in \mathbf{R}^n$, and a noise $b \in \mathcal{C}_0([0,T],\mathbf{R}^n)$, we look for a function
$\Phi_T(x,b) \in \mathcal{C}([0,T],\mathbf{R}^n)$ satisfying

$$\Phi_T(x,b)(t) = \sigma b(t) + x + \int_0^t f\bigl(\Phi_T(x,b)(s)\bigr)\,ds . \tag{3.13}$$

We have the following standard result:

Lemma 3.9 Let $f : \mathbf{R}^n \to \mathbf{R}^n$ satisfy assumptions A1 and A2. Then, there exists a
unique map $\Phi_T : \mathbf{R}^n \times \mathcal{C}([0,T],\mathbf{R}^n) \to \mathcal{C}([0,T],\mathbf{R}^n)$ satisfying (3.13). Furthermore,
$\Phi_T$ is locally Lipschitz continuous.

Proof. The local (i.e. small $T$) existence and uniqueness of continuous solutions to
(3.13) follows from a standard contraction argument. In order to show the global
existence and the local Lipschitz property, fix $x$, $b$, and $T$, and define $y(t) = x + \sigma b(t)$.
Define $z(t)$ as the solution to the differential equation

$$\dot z(t) = f\bigl(z(t) + y(t)\bigr) , \qquad z(0) = 0 . \tag{3.14}$$

Writing down the differential equation satisfied by $\|z(t)\|^2$ and using A1 and A2,
one sees that (3.14) possesses a (unique) solution up to time $T$. One can then set
$\Phi_T(x,b)(t) = z(t) + y(t)$ and check that it satisfies (3.13). The local Lipschitz property
of $\Phi_T$ then immediately follows from the local Lipschitz property of $f$. □
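
The decomposition $\Phi_T(x,b) = z + y$ used in this proof also suggests a simple way of approximating the pathwise solution numerically, since $z$ solves an ordinary differential equation with a continuous right-hand side. The Python sketch below is only an illustration under our own assumptions: the state is scalar ($n = 1$), the explicit Euler scheme is a choice of ours and not part of the argument above, and the name `solve_pathwise` is hypothetical.

```python
import math


def solve_pathwise(f, x0, sigma, b, T, steps):
    """Euler approximation of Phi_T(x0, b) on a uniform grid of [0, T].

    With y(t) = x0 + sigma * b(t), the function z solves z' = f(z + y),
    z(0) = 0, as in (3.14), and the returned values approximate z + y.
    The path b is a callable with b(0) = 0.
    """
    dt = T / steps
    z = 0.0
    path = [x0 + sigma * b(0.0)]
    for k in range(steps):
        t = k * dt
        y = x0 + sigma * b(t)
        z += dt * f(z + y)                     # Euler step for (3.14)
        path.append(z + x0 + sigma * b(t + dt))
    return path
```

No differentiability of $b$ is needed here: only the values of $b$ enter the scheme, exactly as in (3.13).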

We now define the stationary noise process. For this, we define $\theta_t : \mathcal{H}_H \to \mathcal{H}_H$ by

$$(\theta_t w)(s) = w(s-t) - w(-t) .$$

In order to construct the transition semigroup $\mathcal{P}_t$, we define first $\tilde{\mathcal{H}}_H$ like $\mathcal{H}_H$, but with
arguments in $\mathbf{R}_+$ instead of $\mathbf{R}_-$, and we write $\tilde{\mathbf{W}}$ for the Wiener measure on $\tilde{\mathcal{H}}_H$, as
constructed in Lemma 3.8 above. Define the function $P_t : \mathcal{H}_H \times \tilde{\mathcal{H}}_H \to \mathcal{H}_H$ by

$$\bigl(P_t(w,\tilde w)\bigr)(s) = \begin{cases} \tilde w(t+s) - \tilde w(t) & \text{for } s > -t , \\ w(t+s) - \tilde w(t) & \text{for } s \le -t , \end{cases} \tag{3.15}$$

and set $\mathcal{P}_t(w,\cdot) = P_t(w,\cdot)^*\tilde{\mathbf{W}}$. This construction can be visualised by the following
picture:



One then has the following.

Lemma 3.10 The quadruple $(\mathcal{H}_H, \{\mathcal{P}_t\}_{t\ge 0}, \mathbf{W}, \{\theta_t\}_{t\ge 0})$ is a stationary noise
process.

Proof. We already know from Lemma 3.5 that $\mathcal{H}_H$ is Polish. Furthermore, one has
$\theta_t \circ P_t(w,\cdot) = w$, so it remains to show that $\mathcal{P}_t$ is a Feller transition semigroup with
$\mathbf{W}$ as its unique invariant measure. It is straightforward to check that it is a transition
semigroup and the Feller property follows from the continuity of $P_t(w,\tilde w)$ with respect
to $w$. By the definition (3.15) and the time-reversal invariance of the Wiener process,
every invariant measure for $\{\mathcal{P}_t\}_{t\ge 0}$ must have its finite-dimensional distributions
coincide with those of $\mathbf{W}$. Since the Borel $\sigma$-field on $\mathcal{H}_H$ is generated by the evaluation
functions, this shows that $\mathbf{W}$ is the only invariant measure. □
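
The identity $\theta_t \circ P_t(w,\cdot) = w$ used in this proof can be checked directly from (3.15) and the definition of $\theta_t$. The following Python sketch does so for paths represented as ordinary functions; the names `concat` and `theta` are ours, and the deterministic test paths merely stand in for Wiener paths.

```python
def concat(w, w_new, t):
    """P_t(w, w_new) of (3.15): append the segment w_new on [0, t] to the past w.

    `w` is defined on (-inf, 0], `w_new` on [0, inf), and both vanish at 0.
    """
    def glued(s):
        if s > -t:
            return w_new(t + s) - w_new(t)
        return w(t + s) - w_new(t)
    return glued


def theta(w, t):
    """The shift theta_t w : s -> w(s - t) - w(-t), which forgets the last t units."""
    return lambda s: w(s - t) - w(-t)
```

Applying `theta(concat(w, w_new, t), t)` returns the original past `w`, whatever the appended segment `w_new` was.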

We now construct a SDS over $n$ copies of the above noise process. With a slight
abuse of notation, we denote that noise process by $(\mathcal{W}, \{\mathcal{P}_t\}_{t\ge 0}, \mathbf{W}, \{\theta_t\}_{t\ge 0})$. We
define the (continuous) shift operator $R_T : \mathcal{C}((-\infty,0],\mathbf{R}^n) \to \mathcal{C}_0([0,T],\mathbf{R}^n)$ by
$(R_T b)(t) = b(t-T) - b(-T)$ and set

$$\varphi : \mathbf{R}_+ \times \mathbf{R}^n \times \mathcal{W} \to \mathbf{R}^n , \qquad (t,x,w) \mapsto \Phi_t(x, R_t\mathcal{D}_H w)(t) . \tag{3.16}$$

From the above results, the following is straightforward:

Proposition 3.11 The function $\varphi$ of (3.16) defines a continuous SDS over the noise
process $(\mathcal{W}, \{\mathcal{P}_t\}_{t\ge 0}, \mathbf{W}, \{\theta_t\}_{t\ge 0})$. Furthermore, for every generalised initial
condition $\mu$, the process generated by $\varphi$ from $\mu$ is a solution to (SDE) in the sense of
Definition 3.1.

Proof. The regularity properties of $\varphi$ have already been shown in Lemma 3.9. The
cocycle property is an immediate consequence of the composition property for solutions
of ODEs. The fact that the processes generated by $\varphi$ are solutions to (SDE) is a direct
consequence of (3.13), combined with Lemma 3.2, the definition of $\mathcal{D}_H$, and the fact
that $\mathbf{W}$ is the Wiener measure. □

To conclude this section, we show that, thanks to the dissipativity condition
imposed on the drift term $f$, the SDS defined above admits any power of the Euclidean
norm on $\mathbf{R}^n$ as a Lyapunov function:

Proposition 3.12 Let $\varphi$ be the continuous SDS defined above and assume that A1 and
A2 hold. Then, for every $p \ge 2$, the map $x \mapsto \|x\|^p$ is a Lyapunov function for $\varphi$.

Proof. Fix $p \ge 2$ and let $\mu$ be an arbitrary generalised initial condition satisfying

$$\int_{\mathbf{R}^n} \|x\|^p\,(\Pi_{\mathbf{R}^n}^*\mu)(dx) < \infty .$$

Let $\tilde\varphi$ be the continuous SDS associated by Proposition 3.11 to the equation

$$dy(t) = -y\,dt + \sigma\,dB_H(t) . \tag{3.17}$$

Notice that both $\varphi$ and $\tilde\varphi$ are defined over the same stationary noise process.

We define $x_t$ as the process generated by $\varphi$ from $\mu$ and $y_t$ as the process generated
by $\tilde\varphi$ from $\delta_0 \times \mathbf{W}$ (in other words $y_0 = 0$). Since both SDS are defined over the same
stationary noise process, $x_t$ and $y_t$ are defined over the same probability space. The
process $y_t$ is obviously Gaussian, and a direct (but lengthy) calculation shows that its
variance is given by:

$$\mathbf{E}\|y_t\|^2 = 2H\,\mathrm{tr}(\sigma\sigma^*)\,e^{-t} \int_0^t s^{2H-1}\cosh(t-s)\,ds .$$

In particular, one has for all $t$:

$$\mathbf{E}\|y_t\|^2 \le 2H\,\mathrm{tr}(\sigma\sigma^*) \int_0^\infty s^{2H-1}e^{-s}\,ds = \Gamma(2H+1)\,\mathrm{tr}(\sigma\sigma^*) \equiv C_\infty . \tag{3.18}$$

Now define $z_t = x_t - y_t$. The process $z_t$ is seen to satisfy the random differential
equation given by

$$\frac{dz_t}{dt} = f(z_t + y_t) + y_t , \qquad z_0 = x_0 .$$

Furthermore, one has the following equation for $\|z_t\|^2$:

$$\frac{d\|z_t\|^2}{dt} = 2\langle z_t, f(z_t + y_t)\rangle + 2\langle z_t, y_t\rangle .$$

Using A2-A3 and the Cauchy-Schwarz inequality, we can estimate the right-hand side
of this expression by:

$$\frac{d\|z_t\|^2}{dt} \le 2C_1^{A1} - 2C_2^{A1}\|z_t\|^2 + 2\langle z_t, y_t + f(y_t)\rangle \le -C_2^{A1}\|z_t\|^2 + \tilde C(1+\|y_t\|^2)^N , \tag{3.19}$$

for some constant $\tilde C$. Therefore

$$\|z_t\|^2 \le e^{-C_2^{A1}t}\|x_0\|^2 + \tilde C \int_0^t e^{-C_2^{A1}(t-s)}(1+\|y_s\|^2)^N\,ds .$$

It follows immediately from the fact that $y_s$ is Gaussian with bounded
covariance (3.18) that there exists a constant $C_p$ such that

$$\mathbf{E}\|z_t\|^p \le C_p e^{-pC_2^{A1}t/2}\,\mathbf{E}\|x_0\|^p + C_p ,$$

for all times $t > 0$. Therefore (2.8) holds and the proof of Proposition 3.12 is complete.

□ 
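
The variance formula for $y_t$ and the bound (3.18) used in the proof above are easy to check numerically. The Python sketch below is an illustration only: the midpoint rule and the substitution $u = s^{2H}$ (which removes the singularity of $s^{2H-1}$ at the origin) are our own choices, and the function names are hypothetical.

```python
import math


def ou_variance(t, hurst, trace=1.0, steps=20000):
    """E||y_t||^2 from the formula preceding (3.18), by the midpoint rule.

    With u = s**(2H) one has
    2H * int_0^t s^(2H-1) cosh(t - s) ds = int_0^(t^(2H)) cosh(t - u^(1/(2H))) du.
    `trace` stands for tr(sigma sigma^*).
    """
    upper = t ** (2.0 * hurst)
    du = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * du
        total += math.cosh(t - u ** (1.0 / (2.0 * hurst)))
    return trace * math.exp(-t) * total * du


def variance_bound(hurst, trace=1.0):
    """The constant C_infinity = Gamma(2H + 1) tr(sigma sigma^*) of (3.18)."""
    return math.gamma(2.0 * hurst + 1.0) * trace
```

For $H = \frac12$ the formula reduces to $(1 - e^{-2t})/2$, the variance of the usual Ornstein-Uhlenbeck process started at the origin.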
4 Coupling construction

We do now have the necessary formalism to study the long-time behaviour of the SDS
$\varphi$ we constructed from (SDE). The main tool that will allow us to do that is the notion
of self-coupling for stochastic dynamical systems.

4.1 Self-coupling of SDS 

The main goal of this paper is to show that the asymptotic behaviour of the solutions of 
(SDE) does not depend on its initial condition. This will then imply that the dynamics 
converges to a stationary state (in a suitable sense). We therefore look for a suitable 
way of comparing solutions to (SDE). In general, two solutions starting from different
initial points in $\mathbf{R}^n$ and driven with the same realisation of the noise $B_H$ have no reason
to get close to each other as time goes by. Condition A1 indeed only ensures that
they will tend to approach each other as long as they are sufficiently far apart. This
is reasonable, since by comparing only solutions driven by the same realisation of the 
noise process, one completely forgets about the randomness of the system and the 
"blurring" this randomness induces. 

It is therefore important to compare probability measures (for example on
pathspace) induced by the solutions rather than the solutions themselves. More precisely,
given a SDS $\varphi$ and two generalised initial conditions $\mu$ and $\nu$, we want to compare
the measures $Q\mathcal{Q}_t\mu$ and $Q\mathcal{Q}_t\nu$ as $t$ goes to infinity. The distance we will work with
is the total variation distance, henceforth denoted by $\|\cdot\|_{TV}$. We will actually use the
following useful representation of the total variation distance. Let $\Omega$ be a measurable
space and let $\mathbf{P}_1$ and $\mathbf{P}_2$ be two probability measures on $\Omega$. We denote by $\mathcal{C}(\mathbf{P}_1,\mathbf{P}_2)$ the
set of all probability measures on $\Omega \times \Omega$ which are such that their marginals on the two
components are equal to $\mathbf{P}_1$ and $\mathbf{P}_2$ respectively. Let furthermore $\Delta \subset \Omega \times \Omega$ denote
the diagonal, i.e. the set of elements of the form $(\omega,\omega)$. We then have

$$\|\mathbf{P}_1 - \mathbf{P}_2\|_{TV} = 2 - \sup_{\mathbf{P}\in\mathcal{C}(\mathbf{P}_1,\mathbf{P}_2)} 2\mathbf{P}(\Delta) . \tag{4.1}$$

Elements of $\mathcal{C}(\mathbf{P}_1,\mathbf{P}_2)$ will be referred to as couplings between $\mathbf{P}_1$ and $\mathbf{P}_2$. This leads
naturally to the following definition:

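Definition 4.1 Let $\varphi$ be a SDS with state space $\mathcal{X}$ and let $\mathcal{M}_\varphi$ be the associated space
of generalised initial conditions. A self-coupling for $\varphi$ is a measurable map $(\mu,\nu) \mapsto Q(\mu,\nu)$
from $\mathcal{M}_\varphi \times \mathcal{M}_\varphi$ into $\mathcal{D}(\mathbf{R}_+,\mathcal{X}) \times \mathcal{D}(\mathbf{R}_+,\mathcal{X})$, with the property that for every
pair $(\mu,\nu)$, $Q(\mu,\nu)$ is a coupling for $Q\mu$ and $Q\nu$.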
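
On a finite space, the representation (4.1) can be made completely explicit: the largest mass that a coupling can put on the diagonal is $\sum_\omega \min\{\mathbf{P}_1(\omega), \mathbf{P}_2(\omega)\}$, a standard fact about maximal couplings. The Python sketch below (names are ours; distributions are plain dictionaries) compares both sides of (4.1) in that setting.

```python
def total_variation(p, q):
    """||P1 - P2||_TV with the normalisation of (4.1), so values lie in [0, 2]."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def maximal_diagonal_mass(p, q):
    """Largest mass that a coupling of p and q can put on the diagonal."""
    return sum(min(p[k], q[k]) for k in set(p) & set(q))
```

In this normalisation, mutually singular measures are at distance 2 and no coupling of them charges the diagonal.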

Define the shift map $\Sigma_t : \mathcal{D}(\mathbf{R}_+,\mathcal{X}) \to \mathcal{D}(\mathbf{R}_+,\mathcal{X})$ by

$$(\Sigma_t x)(s) = x(t+s) .$$

It follows immediately from the cocycle property and the stationarity of the noise
process that $Q\mathcal{Q}_t\mu = \Sigma_t^*Q\mu$. Therefore, the measure $\Sigma_t^*Q(\mu,\nu)$ is a coupling for $Q\mathcal{Q}_t\mu$
and $Q\mathcal{Q}_t\nu$ (which is in general different from the coupling $Q(\mathcal{Q}_t\mu,\mathcal{Q}_t\nu)$). Our aim in
the remainder of this paper is to construct a self-coupling $Q(\mu,\nu)$ for the SDS
associated to (SDE) which has the property that

$$\lim_{t\to\infty} \bigl(\Sigma_t^*Q(\mu,\nu)\bigr)(\Delta) = 1 ,$$

where $\Delta$ denotes as before the diagonal of the space $\mathcal{D}(\mathbf{R}_+,\mathcal{X}) \times \mathcal{D}(\mathbf{R}_+,\mathcal{X})$. We will
then use the inequality

$$\|Q\mathcal{Q}_t\mu - Q\mathcal{Q}_t\nu\|_{TV} \le 2 - 2\bigl(\Sigma_t^*Q(\mu,\nu)\bigr)(\Delta) , \tag{4.2}$$

to deduce the uniqueness of the stationary state for (SDE).

In the remainder of the paper, the general way of constructing such a self-coupling
will be the following. First, we fix a Polish space $\mathcal{A}$ that contains some auxiliary
information on the dynamics of the coupled process we want to keep track of. We also
define a "future" noise space $\mathcal{W}_+$ to be equal to $\tilde{\mathcal{H}}_H^n$, where $\tilde{\mathcal{H}}_H$ is as in (3.15). There
is a natural continuous time-shift operator on $\mathbf{R} \times \mathcal{W} \times \mathcal{W}_+$ defined for $t > 0$ by

$$(s, w, \tilde w) \mapsto \bigl(s - t, P_t(w,\tilde w), S_t\tilde w\bigr) , \qquad (S_t\tilde w)(r) = \tilde w(r+t) - \tilde w(t) , \tag{4.3}$$

where $P_t$ was defined in (3.15). We then construct a (measurable) map

$$\begin{aligned}
\mathcal{C} : \mathcal{X}^2 \times \mathcal{W}^2 \times \mathcal{A} &\to \mathbf{R} \times \mathcal{M}_1(\mathcal{A} \times \mathcal{W}_+^2) , \\
(x, y, w_x, w_y, a) &\mapsto \bigl(T(x,y,w_x,w_y,a), \mathbf{W}_2(x,y,w_x,w_y,a)\bigr) ,
\end{aligned} \tag{4.4}$$

with the properties that, for all $(x,y,w_x,w_y,a)$,

(C1) The time $T(x,y,w_x,w_y,a)$ is positive and greater than 1.

(C2) The marginals of $\mathbf{W}_2(x,y,w_x,w_y,a)$ onto the two copies of $\mathcal{W}_+$ are both equal
to the Wiener measure $\tilde{\mathbf{W}}$.

We call the map $\mathcal{C}$ the "coupling map", since it yields a natural way of constructing a
self-coupling for the SDS $\varphi$. The remainder of this subsection explains how to achieve
this.

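Given the map $\mathcal{C}$, we can construct a Markov process on the augmented space
$\mathbf{X} = \mathcal{X}^2 \times \mathcal{W}^2 \times \mathbf{R}_+ \times \mathcal{A} \times \mathcal{W}_+^2$ in the following way. As long as the component
$\tau \in \mathbf{R}_+$ is positive, we just time-shift the elements in $\mathcal{W}^2 \times \mathcal{W}_+^2 \times \mathbf{R}_+$ according to
(4.3) and we evolve in $\mathcal{X}^2$ by solving (SDE). As soon as $\tau$ becomes 0, we redraw the
future of the noise up to time $T(x,y,w_x,w_y,a)$ according to the distribution $\mathbf{W}_2$, which may
at the same time modify the information stored in $\mathcal{A}$.

To shorten notations, we denote elements of $\mathbf{X}$ by

$$X = (x, y, w_x, w_y, \tau, a, \tilde w_x, \tilde w_y) .$$

With this notation, the transition function $Q_t$ for the process we just described is
defined by:

• For $t < \tau$, we define $Q_t(X;\cdot)$ by

$$\delta_{\varphi_t(x,P_t(w_x,\tilde w_x))} \times \delta_{\varphi_t(y,P_t(w_y,\tilde w_y))} \times \delta_{P_t(w_x,\tilde w_x)} \times \delta_{P_t(w_y,\tilde w_y)} \times \delta_{\tau-t} \times \delta_a \times \delta_{S_t\tilde w_x} \times \delta_{S_t\tilde w_y} .$$

• For $t = \tau$, we define $Q_t(X;\cdot)$ by

$$\begin{aligned}
&\delta_{\varphi_t(x,P_t(w_x,\tilde w_x))} \times \delta_{\varphi_t(y,P_t(w_y,\tilde w_y))} \times \delta_{P_t(w_x,\tilde w_x)} \times \delta_{P_t(w_y,\tilde w_y)} \times \delta_{T(x,y,P_t(w_x,\tilde w_x),P_t(w_y,\tilde w_y),a)} \\
&\quad \times \mathbf{W}_2\bigl(x, y, P_t(w_x,\tilde w_x), P_t(w_y,\tilde w_y), a\bigr) .
\end{aligned} \tag{4.5}$$

• For $t > \tau$, we define $Q_t$ by imposing that the Chapman-Kolmogorov equations
hold. Since we assumed that $T(x,y,w_x,w_y,a)$ is always greater than 1, this
procedure is well-defined.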

We now construct an initial condition for this process, given two generalised initial
conditions $\mu_1$ and $\mu_2$ for $\varphi$. We do this in such a way that, in the beginning, the noise
component of our process lives on the diagonal of the space $\mathcal{W}^2$. In other words, the
two copies of the two-sided fBm driving our coupled system have the same past. This is
possible since the marginals of $\mu_1$ and $\mu_2$ on $\mathcal{W}$ coincide. Concerning the components
of the initial condition in $\mathbf{R}_+ \times \mathcal{A} \times \mathcal{W}_+^2$, we just draw them according to the map $\mathcal{C}$,
with some distinguished element $a_0 \in \mathcal{A}$.

We call $Q_0(\mu_1,\mu_2)$ the measure on $\mathbf{X}$ constructed by this procedure. Consider a
cylindrical subset of $\mathbf{X}$ of the form

$$X = X_1 \times X_2 \times W_1 \times W_2 \times F ,$$

where $F$ is a measurable subset of $\mathbf{R}_+ \times \mathcal{A} \times \mathcal{W}_+^2$. We make use of the disintegration
$w \mapsto \mu_i^w$, yielding formally $\mu_i(dx,dw) = \mu_i^w(dx)\,\mathbf{W}(dw)$, and we define $Q_0(\mu_1,\mu_2)$
by

$$Q_0(\mu_1,\mu_2)(X) = \int_{W_1\cap W_2}\int_{X_1}\int_{X_2} \bigl(\delta_{T(x_1,x_2,w,w,a_0)} \times \mathbf{W}_2(x_1,x_2,w,w,a_0)\bigr)(F)\;\mu_2^w(dx_2)\,\mu_1^w(dx_1)\,\mathbf{W}(dw) . \tag{4.6}$$

With this definition, we finally construct the self-coupling $Q(\mu_1,\mu_2)$ of $\varphi$
corresponding to the function $\mathcal{C}$ as the marginal on $\mathcal{C}(\mathbf{R}_+,\mathcal{X}) \times \mathcal{C}(\mathbf{R}_+,\mathcal{X})$ of the process generated
by the initial condition $Q_0(\mu_1,\mu_2)$ evolving under the semigroup given by $Q_t$.
Condition (C2) ensures that this is indeed a coupling for $Q\mu_1$ and $Q\mu_2$.

The following subsection gives an overview of the way the coupling function $\mathcal{C}$ is
constructed.

4.2 Construction of the coupling function 

Let us consider that the initial conditions $\mu_1$ and $\mu_2$ are fixed once and for all and
denote by $x_t$ and $y_t$ the two $\mathcal{X}$-valued processes obtained by considering the marginals
of $Q(\mu_1,\mu_2)$ on its two $\mathcal{X}$ components. Define the random (but not stopping) time $\tau_\infty$
by

$$\tau_\infty = \inf\{t > 0 \mid x_s = y_s \text{ for all } s > t\} .$$

Our aim is to find a space $\mathcal{A}$ and a function $\mathcal{C}$ satisfying (C1) and (C2) such that
the processes $x_t$ and $y_t$ eventually meet and stay together for all times, i.e. such that
$\lim_{T\to\infty}\mathbf{P}(\tau_\infty < T) = 1$. If the noise process driving the system was Markov, the
"stay together" part of this statement would not be a problem, since it would suffice
to start driving $x_t$ and $y_t$ with identical realisations of the noise as soon as they meet.
Since the fBm is not Markov, it is possible to make the future realisations of two copies
coincide with probability 1 only if the past realisations also coincide. If the past
realisations do not coincide for some time, we interpret this as introducing a "cost" into the
system, which we need to master (this notion of cost will be made precise in
Definition 5.3 below). Fortunately, the memory of past events becomes smaller and smaller
as time goes by, which can be interpreted as a natural tendency of the cost to decrease.
This way of interpreting our system leads to the following algorithm that should be
implemented by the coupling function $\mathcal{C}$.


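[Diagram (4.7): flowchart linking three steps. Step 1: try to make $x_t$ and $y_t$ meet. Step 2: try to keep $x_t$ and $y_t$ together. Step 3: wait until the cost is low. Surviving arrow label: "failure".]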



The precise meaning of the statements appearing in this diagram will be made clear
in the sequel, but the general idea of the construction should be clear by now. One
step in (4.7) corresponds to the time between two jumps of the $\tau$-component of the
coupled process. Our aim is to construct the coupling function $\mathcal{C}$ in such a way that,
with probability 1, there is a time after which step 2 always succeeds. This time is then
precisely the random time $\tau_\infty$ we want to estimate.
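
The succession of steps in (4.7) can be summarised by a small transition function. The Python sketch below is only our reading of the algorithm: we assume that a success at step 1 or 2 leads to step 2, that a failure at step 1 or 2 leads to the waiting step 3, and that step 3 is always followed by a new attempt at step 1. The name `next_step` is hypothetical.

```python
def next_step(step: int, success: bool) -> int:
    """Step of (4.7) performed after `step` has ended with outcome `success`.

    Assumed transitions: 1 -> 2 and 2 -> 2 on success, 1 -> 3 and 2 -> 3 on
    failure, and 3 -> 1 regardless of the outcome flag.
    """
    if step not in (1, 2, 3):
        raise ValueError("step must be 1, 2 or 3")
    if step == 3:
        return 1      # the waiting period is over: try to meet again
    if not success:
        return 3      # a failed step 1 or 2 is followed by a waiting period
    return 2          # after a success, try to keep the processes together
```

With these transitions, the statement that step 2 eventually always succeeds means that the sequence of steps is eventually constant and equal to 2.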

It is clear from what has just been exposed that we will actually never need to
consider the continuous-time process on the space $\mathbf{X}$ given by the self-coupling described
in the previous section, but it is sufficient to describe what happens at the beginning of
each step in (4.7). We will therefore only consider the discrete-time dynamic obtained
by sampling the continuous-time system just before each step. The discrete-time
dynamic will take place on the space $\mathcal{Z} = (\mathcal{X}^2 \times \mathcal{W}^2 \times \mathcal{A}) \times \mathbf{R}_+$ and we will denote its
elements by

$$(Z, \tau) , \qquad Z = (x, y, w_x, w_y, a) , \qquad \tau \in \mathbf{R}_+ .$$

Since the time steps of the discrete dynamic are not equally spaced, the time $\tau$ is
required to keep track of how much time really elapsed. The dynamic of the discrete
process $(Z_n, \tau_n)$ on $\mathcal{Z}$ is determined by the function $\Phi : \mathbf{R}_+ \times \mathcal{Z} \times (\mathcal{A} \times \mathcal{W}_+^2) \to \mathcal{Z}$
given by

$$\Phi\bigl(t, (Z,\tau), (\tilde w_x, \tilde w_y, \tilde a)\bigr) = \bigl(\varphi_t(x, P_t(w_x,\tilde w_x)), \varphi_t(y, P_t(w_y,\tilde w_y)), P_t(w_x,\tilde w_x), P_t(w_y,\tilde w_y), \tilde a, \tau + t\bigr) .$$

(The notations are the same as in the definition of $Q_t$ above.) With this definition at
hand, the transition function for the process $(Z_n, \tau_n)$ is given by

$$\mathcal{P}(Z,\tau) = \Phi\bigl(T(Z), (Z,\tau), \cdot\bigr)^*\mathbf{W}_2(Z) , \tag{4.8}$$

where $T$ and $\mathbf{W}_2$ are defined in (4.4). Given two generalised initial conditions $\mu_1$ and
$\mu_2$ for the original SDS, the initial condition $(Z_0, \tau_0)$ is constructed by choosing $\tau_0 = 0$
and by drawing $Z_0$ according to the measure

$$X \mapsto \delta_{a_0}(A) \int_{W_1\cap W_2}\int_{X_1}\int_{X_2} \mu_2^w(dx_2)\,\mu_1^w(dx_1)\,\mathbf{W}(dw) ,$$

where $X$ is a cylindrical set of the form $X = X_1 \times X_2 \times W_1 \times W_2 \times A$. It follows from
the definitions (4.5) and (4.6) that if we define $\tau_n$ as the $n$th jump of the process on $\mathbf{X}$
constructed above and $Z_n$ as (the component in $\mathcal{X}^2 \times \mathcal{W}^2 \times \mathcal{A}$ of) its left-hand limit at
$\tau_n$, the process we obtain is equal in law to the Markov chain that we just constructed.

Before carrying on further with the construction of $\mathcal{C}$, we make a few preliminary
computations to see how changes in the past of the fBm affect its future. The formulae
and estimates obtained in the next subsection are crucial for the construction of $\mathcal{C}$ and
for obtaining the bounds that lead to Theorems 1.2 and 1.3. In particular,
Proposition 4.4 is the main estimate that leads to the coherence of the coupling construction
and to the bounds on the convergence rate towards the stationary state.

4.3 Influence of the past on the future 

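Let $w_x \in \mathcal{H}_H$ and set $B_x = \mathcal{D}_H w_x$. Consider furthermore two functions $g_w$ and $g_B$
satisfying

$$t \mapsto \int_0^t g_w(s)\,ds \in \mathcal{H}_H , \qquad t \mapsto \int_0^t g_B(s)\,ds \in \mathcal{H}_{1-H} , \tag{4.9}$$

and define $B_y$ and $w_y$ by $B_y(0) = w_y(0) = 0$ and

$$dB_y = dB_x + g_B\,dt , \qquad dw_y = dw_x + g_w\,dt . \tag{4.10}$$

As an immediate consequence of the definition of $\mathcal{D}_H$, the following relations between
$g_w$ and $g_B$ will ensure that $B_y = \mathcal{D}_H w_y$.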

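Lemma 4.2 Let $B_x$, $B_y$, $w_x$, $w_y$, $g_B$, and $g_w$ be as in (4.9), (4.10) and assume that
$B_x = \mathcal{D}_H w_x$ and $B_y = \mathcal{D}_H w_y$. Then, $g_w$ and $g_B$ satisfy the following relation:

$$g_w(t) = \alpha_H \frac{d}{dt}\int_{-\infty}^t (t-s)^{\frac12-H} g_B(s)\,ds , \tag{4.11a}$$

$$g_B(t) = \gamma_H\alpha_{1-H} \frac{d}{dt}\int_{-\infty}^t (t-s)^{H-\frac12} g_w(s)\,ds . \tag{4.11b}$$

If $g_w(t) = 0$ for $t > t_0$, one has

$$g_B(t) = (H-\tfrac12)\gamma_H\alpha_{1-H} \int_{-\infty}^{t_0} (t-s)^{H-\frac32} g_w(s)\,ds , \tag{4.11c}$$

for $t > t_0$. Similarly, if $g_B(t) = 0$ for $t > t_0$, one has

$$g_w(t) = (\tfrac12-H)\alpha_H \int_{-\infty}^{t_0} (t-s)^{-H-\frac12} g_B(s)\,ds , \tag{4.11d}$$

for $t > t_0$. If $g_w$ is differentiable for $t > t_0$ and $g_w(t) = 0$ for $t < t_0$, one has

$$g_B(t) = \frac{\gamma_H\alpha_{1-H}\,g_w(t_0)}{(t-t_0)^{\frac12-H}} + \gamma_H\alpha_{1-H}\int_{t_0}^t \frac{g_w'(s)}{(t-s)^{\frac12-H}}\,ds , \tag{4.11e}$$

for $t > t_0$. Similarly, if $g_B$ is differentiable for $t > t_0$ and $g_B(t) = 0$ for $t < t_0$, one
has

$$g_w(t) = \frac{\alpha_H\,g_B(t_0)}{(t-t_0)^{H-\frac12}} + \alpha_H\int_{t_0}^t \frac{g_B'(s)}{(t-s)^{H-\frac12}}\,ds , \tag{4.11f}$$

for $t > t_0$.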

Proof. The claims (4.11a) and (4.11b) follow immediately from (4.10), using the
linearity of $\mathcal{D}_H$ and the inversion formula. The other claims are simply obtained by
differentiating under the integral, see [SKM93] for a justification. □

We will be led in the sequel to consider the following situation, where $t_1$, $t_2$ and $g_1$
are assumed to be given:

[Figure (4.12): graphs of $g_w(t)$ and $g_B(t)$ against $t$, with marks at $t = 0$, $t = t_1$ and $t = t_2$; the part of $g_w$ beyond $t_2$ is labelled $g_2(t - t_2)$.]

In this picture, $g_w$ and $g_B$ are related by (4.11a-4.11b) as before. The boldfaced regions
indicate that we consider the corresponding parts of $g_w$ or $g_B$ to be given. The dashed
regions indicate that those parts of $g_w$ and $g_B$ are computed from the boldfaced regions
by using the relations (4.11a-4.11b). The picture is coherent since the formulae
(4.11a-4.11b) in both cases only use information about the past to compute the present. One
should think of the interval $[0, t_1]$ as representing the time spent on steps 1 and 2 of the
algorithm (4.7). The interval $[t_1, t_2]$ corresponds to the waiting time, i.e. step 3. Let us
first give an explicit formula for $g_2$ in terms of $g_1$:



Lemma 4.3 Consider the situation of Proposition 4.4. Then, $g_2$ is given by

$$g_2(t) = C \int_0^{t_1} \frac{t^{\frac12-H}(t_2-s)^{H-\frac12}}{t+t_2-s}\,g_1(s)\,ds , \tag{4.13}$$

with a constant $C$ depending only on $H$.






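Proof. We extend $g_1(t)$ to the whole real line by setting it equal to 0 outside of $[0, t_1]$.
Using Lemma 4.2, we see that, for some constant $C$ and for $t > t_2$,

$$\begin{aligned}
g_2(t - t_2) &= C \int_0^{t_2} (t-s)^{-H-\frac12} g_B(s)\,ds \\
&= C \int_0^{t_2} (t-s)^{-H-\frac12} \frac{d}{ds}\int_0^s (s-r)^{H-\frac12} g_1(r)\,dr\,ds \\
&= C (t-t_2)^{-H-\frac12} \int_0^{t_1} (t_2-r)^{H-\frac12} g_1(r)\,dr - C(H+\tfrac12) \int_0^{t_2} (t-s)^{-H-\frac32} \int_0^s (s-r)^{H-\frac12} g_1(r)\,dr\,ds \\
&\equiv C \int_0^{t_1} K(t,r)\,g_1(r)\,dr ,
\end{aligned}$$

where the integration stops at $t_1$ because $g_1$ is equal to 0 for larger values of $t$. The
kernel $K$ is given by

$$\begin{aligned}
K(t,r) &= (t-t_2)^{-H-\frac12}(t_2-r)^{H-\frac12} - (H+\tfrac12)\int_r^{t_2} (t-s)^{-H-\frac32}(s-r)^{H-\frac12}\,ds \\
&= \frac{(t-t_2)^{-H-\frac12}(t_2-r)^{H+\frac12}}{t-r}\Bigl(\frac{t-r}{t_2-r} - 1\Bigr) \\
&= \frac{(t-t_2)^{\frac12-H}(t_2-r)^{H-\frac12}}{t-r} ,
\end{aligned}$$

and the claim follows. □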
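
The closed form of the kernel rests on the identity $(H+\frac12)\int_r^{t_2} (t-s)^{-H-\frac32}(s-r)^{H-\frac12}\,ds = (t-t_2)^{-H-\frac12}(t_2-r)^{H+\frac12}/(t-r)$, valid for $r < t_2 < t$. As a sanity check, the Python sketch below compares the integral form of $K(t,r)$ with the closed form numerically. The quadrature (midpoint rule after the substitution $u = (s-r)^{H+\frac12}$) and the function names are our own choices.

```python
def kernel_closed_form(t, r, t2, hurst):
    """(t - t2)^(1/2 - H) (t2 - r)^(H - 1/2) / (t - r), for r < t2 < t."""
    return (t - t2) ** (0.5 - hurst) * (t2 - r) ** (hurst - 0.5) / (t - r)


def kernel_by_quadrature(t, r, t2, hurst, steps=20000):
    """Boundary term minus the integral term, the latter by the midpoint rule.

    Substituting u = (s - r)**(H + 1/2) turns
    (H + 1/2) * int_r^t2 (t - s)^(-H - 3/2) (s - r)^(H - 1/2) ds
    into int_0^U (t - r - u^(1/(H + 1/2)))^(-H - 3/2) du, U = (t2 - r)^(H + 1/2).
    """
    g = hurst + 0.5
    upper = (t2 - r) ** g
    du = upper / steps
    integral = 0.0
    for k in range(steps):
        u = (k + 0.5) * du
        integral += (t - r - u ** (1.0 / g)) ** (-g - 1.0)
    boundary = (t - t2) ** (-g) * (t2 - r) ** (hurst - 0.5)
    return boundary - integral * du
```

The two expressions agree up to the quadrature error for values of $H$ on both sides of $\frac12$.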

We give now estimates on $g_2$ in terms of $g_1$. To this end, given $\alpha > 0$, we introduce
the following norm on functions $g : \mathbf{R}_+ \to \mathbf{R}^n$:

$$\|g\|_\alpha^2 = \int_0^\infty (1+t)^{2\alpha}\|g(t)\|^2\,dt .$$

The following proposition is essential to the coherence of our coupling construction:

Proposition 4.4 Let $t_2 > 2t_1 > 0$, let $g_1 : [0,t_1] \to \mathbf{R}^n$ be a square integrable
function, and define $g_2 : \mathbf{R}_+ \to \mathbf{R}^n$ by

$$g_2(t) = C \int_0^{t_1} \frac{t^{\frac12-H}(t_2-s)^{H-\frac12}}{t+t_2-s}\,g_1(s)\,ds .$$

Then, for every $\alpha$ satisfying

$$0 < \alpha < \min\{\tfrac12 ; H\} ,$$

there exists a constant $\kappa > 0$ depending only on $\alpha$ and $H$ such that the estimate

$$\|g_2\|_\alpha \le \kappa \Bigl|\frac{t_2}{t_1}\Bigr|^{\alpha-\frac12} \|g_1\|_\alpha \tag{4.14}$$

holds.

Remark 4.5 The important features of this proposition are that the constant $\kappa$ does not
depend on $t_1$ or $t_2$ and that the exponent in (4.14) is negative.

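Proof. We define $r = t_2/t_1$ to shorten notations. Using (4.13) and Cauchy-Schwarz,
we then have

$$\begin{aligned}
\|g_2(t)\| &\le C\|g_1\|_\alpha\,t^{\frac12-H}\sqrt{\int_0^{t_1} \frac{(1+s)^{-2\alpha}(rt_1-s)^{2H-1}}{(t+rt_1-s)^2}\,ds} \\
&\le C\|g_1\|_\alpha\,t^{\frac12-H}t_1^{H-\alpha}\sqrt{\int_0^1 \frac{s^{-2\alpha}(r-s)^{2H-1}}{(t+rt_1-t_1s)^2}\,ds} \\
&\le C\|g_1\|_\alpha\,\frac{t^{\frac12-H}t_1^{H-\alpha}r^{H-\frac12}}{t+(r-1)t_1} ,
\end{aligned}$$

where we made use of the assumptions that $2\alpha < 1$ and $r > 2$. Therefore, $\|g_2\|_\alpha$ is
bounded by

$$\|g_2\|_\alpha^2 \le C\|g_1\|_\alpha^2\,t_1^{2H-2\alpha}r^{2H-1}\int_0^\infty \frac{(1+t)^{2\alpha}t^{1-2H}}{(t+(r-1)t_1)^2}\,dt \le \kappa^2 r^{2\alpha-1}\|g_1\|_\alpha^2 ,$$

for some constant $\kappa$, where the last inequality was obtained through the change of
variables $t \mapsto (r-1)t_1t$ and used the fact that $r > 2$. The convergence of the integral
is obtained under the condition $\alpha < H$ which is verified by assumption, so the proof
of Proposition 4.4 is complete. □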

We will construct our coupling function $\mathcal{C}$ in such a way that there always exist
functions $g_w$ and $g_B$ satisfying (4.9) and (4.10), where $w_x$ and $w_y$ denote the noise
components of our coupling process, and $B_x$ and $B_y$ are obtained by applying the
operator $\mathcal{D}_H$ to them. We now have all the necessary ingredients for the construction
of $\mathcal{C}$.



5 Definition of the coupling function 

Our coupling construction depends on a parameter $\alpha < \min\{\tfrac12, H\}$ which we fix once
and for all. This parameter will then be tuned in Section 6.
First of all, we define the auxiliary space $\mathcal{A}$:

$$\mathcal{A} = \{0,1,2,3\} \times \mathbf{N} \times \mathbf{N} \times \mathbf{R}_+\;. \tag{5.1}$$

Elements of $\mathcal{A}$ will be denoted by

$$a = (S, N, \tilde N, T_3)\;. \tag{5.2}$$

The component $S$ denotes which step of (4.7) is going to be performed next (the value
$0$ will be used only for the initial value $a_0$). The counter $N$ is incremented every time
step 2 is performed and is reset to $0$ every time another step is performed. The counter
$\tilde N$ is incremented every time step 1 or step 2 fails. If steps 1 or 2 fail, the time $T_3$
contains the duration of the upcoming step 3. We take

$$a_0 = (0,1,1,0)$$

as initial condition for our coupling construction.

Remember that the coupling function $\mathcal{C}$ is a function from $\mathcal{X}^2 \times \mathcal{W}^2 \times \mathcal{A}$, representing
the state of the system at the end of a step, into $\mathbf{R} \times \mathcal{M}_1(\mathcal{A} \times \mathcal{W}_+^2)$, representing
the duration and the realisation of the noise for the next step. We now define $\mathcal{C}$ for the
four possible values of $S$.

5.1 Initial stage (S = 0)

Notice first that A1 implies that

$$\frac{\langle f(y) - f(x), y - x\rangle}{\|y - x\|} \le C_4^{A1} - C_2^{A1}\,\|y - x\|\;, \tag{5.3}$$

where we set $C_4^{A1} = \sqrt{C_1^{A1}\,(C_2^{A1} + C_3^{A1})}$.

In the beginning, we just wait until the two copies of our process are within distance
$1 + (C_4^{A1}/C_2^{A1})$ of each other. If $x_t$ and $y_t$ satisfy (SDE) with the same realisation of
the noise process $B_H$, and $\varrho_t = y_t - x_t$, we have by (5.3) for $\|\varrho_t\|$ the differential inequality

$$\frac{d\|\varrho_t\|}{dt} = \frac{\langle f(y_t) - f(x_t), \varrho_t\rangle}{\|\varrho_t\|} \le C_4^{A1} - C_2^{A1}\,\|\varrho_t\|\;,$$

and therefore by Gronwall's lemma

$$\|\varrho_t\| \le \|y_0 - x_0\|\,e^{-C_2^{A1} t} + \frac{C_4^{A1}}{C_2^{A1}}\bigl(1 - e^{-C_2^{A1} t}\bigr)\;.$$

It is enough to wait for a time $t = (\log\|y_0 - x_0\|)/C_2^{A1}$ to ensure that $\|\varrho_t\| \le 1 +
(C_4^{A1}/C_2^{A1})$, so we define the coupling function $\mathcal{C}$ in this case by

$$T(Z, a_0) = \max\Bigl\{\frac{\log\|y_0 - x_0\|}{C_2^{A1}}\,,\,1\Bigr\}\;, \qquad \mathcal{W}_2(Z, a_0) = \Delta^*\mathsf{W} \times \delta_{a'}\;, \tag{5.4}$$

where the map $\Delta : \mathcal{W}_+ \to \mathcal{W}_+^2$ is defined by $\Delta(w) = (w, w)$ and the element $a'$ is
given by

$$a' = (1,0,0,0)\;.$$

In other terms, we wait until the two copies of the process are close to each other, and
then we proceed to step 1.
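The choice of the waiting time in (5.4) can be illustrated by a short computation. The sketch below evaluates the Gronwall bound at the time $T(Z,a_0)$ and checks that it lies below $1 + C_4^{A1}/C_2^{A1}$. The numerical values `c2`, `c4` and the function names are hypothetical and serve only as an illustration.

```python
import math


def gronwall_bound(t, d0, c2, c4):
    """Gronwall upper bound on ||rho_t|| when ||rho_0|| = d0."""
    return d0 * math.exp(-c2 * t) + (c4 / c2) * (1.0 - math.exp(-c2 * t))


def initial_wait(d0, c2):
    """Duration max{log(d0) / c2, 1} of the initial stage, as in (5.4)."""
    return max(math.log(d0) / c2, 1.0)


# Hypothetical values standing in for C_2^{A1} and C_4^{A1}.
c2, c4 = 0.5, 2.0

results = []
for d0 in (0.1, 1.0, 1.5, 50.0, 1e6):
    T = initial_wait(d0, c2)
    results.append((d0, T, gronwall_bound(T, d0, c2, c4)))
    print(d0, T, results[-1][2])
```

For $t \ge (\log\|y_0-x_0\|)/C_2^{A1}$ the first term of the bound is at most $1$ and the second is smaller than $C_4^{A1}/C_2^{A1}$, which is why the maximum with $1$ in (5.4) does no harm.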

5.2 Waiting stage (S = 3)

In this stage, both copies evolve with the same realisation of the underlying Wiener
process. Using notations (5.2) and (4.4), we therefore define the coupling function $\mathcal{C}$
in this case by

$$T(Z, a) = T_3\;, \qquad \mathcal{W}_2(Z, a) = \Delta^*\mathsf{W} \times \delta_{a'}\;, \tag{5.5}$$

where the map $\Delta$ is defined as above and the element $a'$ is given by

$$a' = (1, N, \tilde N, 0)\;.$$

Notice that this definition is in accordance with (4.7), i.e. the counters $N$ and $\tilde N$ remain
unchanged, the dynamic evolves for a time $T_3$ with two identical realisations of the
Wiener process (note that the realisations of the fBm driving the two copies of the
system are in general different, since the pasts of the Wiener processes may differ),
and then proceeds to step 1.

5.3 Hitting stage (S = 1)

In this section, we construct and then analyse the map $\mathcal{C}$ corresponding to step 1,
which is the most important step for our construction. We start with a few preliminary
computations. Define $\mathcal{W}^{1,1}$ as the space of almost everywhere differentiable functions
$g$, such that the quantity

$$\|g\|_{1,1} = \int_0^1 \Bigl\|\frac{dg(t)}{dt}\Bigr\|\,dt + \|g(0)\|$$

is finite.

Lemma 5.1 Let $g_B : [0,1] \to \mathbf{R}^n$ be in $\mathcal{W}^{1,1}$ and define $g_w$ by (4.11a) with $H \in
(\tfrac12, 1)$. (The function $g_B$ is extended to $\mathbf{R}$ by setting it equal to $0$ outside of $[0,1]$ and
$g_w$ is considered as a function from $\mathbf{R}_+$ to $\mathbf{R}^n$.) Then, for every $\alpha \in (0, H)$, there exists
a constant $C$ such that

$$\|g_w\|_\alpha \le C\,\|g_B\|_{1,1}\;.$$

Proof. We first bound the $L^2$ norm of $g_w$ on the interval $[0,2]$. Using (4.11f), we can
bound $\|g_w(t)\|$ by

$$\|g_w(t)\| \le C\,\|g_B(0)\|\,t^{\frac12-H} + C \int_0^t \|\dot g_B(s)\|\,(t-s)^{\frac12-H}\,ds\;.$$

Since $t^{\frac12-H}$ is square integrable at the origin, it remains to bound the terms $I_1$ and $I_2$
given by

$$I_1 = \int_0^2 \Bigl(\int_0^t (t-s)^{\frac12-H}\,\|\dot g_B(s)\|\,ds\Bigr)^2 dt\;, \qquad
I_2 = \|g_B(0)\| \int_0^2 t^{\frac12-H} \int_0^t (t-s)^{\frac12-H}\,\|\dot g_B(s)\|\,ds\,dt\;.$$

We only show how to bound $I_1$, as $I_2$ can be bounded in a similar fashion. Writing
$r \vee s = \max\{r, s\}$ one has

$$I_1 = \int_0^2\!\!\int_0^2\!\!\int_{r\vee s}^2 (t-s)^{\frac12-H}(t-r)^{\frac12-H}\,dt\,\|\dot g_B(s)\|\,\|\dot g_B(r)\|\,dr\,ds\;.$$

Since

$$\int_{r\vee s}^2 (t-s)^{\frac12-H}(t-r)^{\frac12-H}\,dt \le \int_{r\vee s}^2 \bigl(t - (r\vee s)\bigr)^{1-2H}\,dt \le \frac{2^{2-2H}}{2-2H}\;,$$

$I_1$ is bounded by $C\,\|g_B\|_{1,1}^2$.

It remains to bound the large-time tail of $g_w$. For $t > 2$, one has, again by
Lemma 4.2,

$$\|g_w(t)\| \le C\,(t-1)^{-H-\frac12} \sup_{s\in[0,1]} \|g_B(s)\| \le C\,(t-1)^{-H-\frac12}\,\|g_B\|_{1,1}\;. \tag{5.6}$$

It follows from the definition that the $\|\cdot\|_\alpha$-norm of this function is bounded if $\alpha < H$.
The proof of Lemma 5.1 is complete. □

In the case $H < \tfrac12$, one has a similar result, but the regularity of $g_B$ can be weakened.

Lemma 5.2 Let $g_B : [0,1] \to \mathbf{R}^n$ be a continuous function and define $g_w$ as in
Lemma 5.1, but with $H \in (0, \tfrac12)$. Then, for every $\alpha \in (0, H)$, there exists a constant $C$
such that

$$\|g_w\|_\alpha \le C \sup_{t\in[0,1]} \|g_B(t)\|\;.$$

Proof. Since $H < \tfrac12$, one can move the derivative under the integral of the first equation
in Lemma 4.2 to get

$$\|g_w(t)\| \le C \int_0^t (t-s)^{-H-\frac12}\,\|g_B(s)\|\,ds \le C \sup_{t\in[0,1]} \|g_B(t)\|\;.$$

This shows that the restriction of $g_w$ to $[0,2]$ is square integrable. The large-time tail
can be bounded by (5.6) as before. □

We have already hinted several times at the notion of a "cost function" that measures
the difficulty of coupling the two copies of the process. This notion is now made
precise. Denote by $Z = (x_0, y_0, w_x, w_y)$ an element of $\mathcal{X}^2 \times \mathcal{W}^2$ and assume that there
exists a square integrable function $g_w : \mathbf{R}_- \to \mathbf{R}^n$ such that

$$w_y(t) = w_x(t) + \int_0^t g_w(s)\,ds\;, \qquad \forall t < 0\;. \tag{5.7}$$

In view of (4.13), we introduce for $T > 0$ the operator $\mathcal{R}_T$ given by

$$(\mathcal{R}_T g)(t) = C \int_{-\infty}^0 \frac{t^{\frac12-H}\,(T-s)^{H-\frac12}}{t+T-s}\,\|g(s)\|\,ds\;,$$

where $C$ is the constant appearing in (4.13). The cost is then defined as follows.
Definition 5.3 The cost function $\mathcal{K}_\alpha : L^2(\mathbf{R}_-) \to [0, \infty]$ is defined by

$$\mathcal{K}_\alpha(g) = \sup_{T>0} \|\mathcal{R}_T g\|_\alpha + C_K \int_{-\infty}^0 (-s)^{H-\frac32}\,\|g(s)\|\,ds\;, \tag{5.8}$$

where, for convenience, we define $C_K = |(2H-1)\,\gamma_H\,\alpha_{1-H}|$. Given $Z$ as above,
$\mathcal{K}_\alpha(Z)$ is defined as $\mathcal{K}_\alpha(g_w)$ if there exists a square integrable function $g_w$ satisfying
(5.7) and as $\infty$ otherwise.

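Remark 5.4 The cost function $\mathcal{K}_\alpha$ defined above has the important property that

$$\mathcal{K}_\alpha(\theta_t g) \le \mathcal{K}_\alpha(g)\;, \qquad \text{for all } t > 0, \tag{5.9}$$

where the shifted function $\theta_t g$ is given by

$$(\theta_t g)(s) = \begin{cases} g(s+t) & \text{if } s < -t, \\ 0 & \text{otherwise.} \end{cases}$$

Furthermore, it is a norm, and thus satisfies the triangle inequality.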

Remark 5.5 By (4.13), the first term in (5.8) measures by how much the two realisations
of the Wiener process have to differ in order to obtain identical increments for the
associated fractional Brownian motions. By (4.11c), the second term in (5.8) measures
by how much the two realisations of the fBm differ if one lets the system evolve with
two identical realisations of the Wiener process.

We now turn to the construction of the process $(x_t, y_t)$ during step 1. We will set
up our coupling construction in such a way that, whenever step 1 is to be performed,
the initial condition $Z$ is admissible in the following sense:

Definition 5.6 Let $\alpha$ satisfy $0 < \alpha < \min\{\tfrac12 ; H\}$. We say that $Z = (x_0, y_0, w_x, w_y)$ is
admissible if one has

$$\|x_0 - y_0\| \le 1 + \frac{C_4^{A1}}{C_2^{A1}}\;, \tag{5.10}$$

(the constants $C_i^{A1}$ are as in A1 and in (5.3)), and its cost satisfies $\mathcal{K}_\alpha(Z) \le 1$.

Denote now by $\Omega$ the space of continuous functions $\omega : [0,1] \to \mathbf{R}^n$ which are the
restriction to $[0,1]$ of an element of $\mathcal{W}_+$. Our aim is to construct two measures $\mathsf{P}_Z^1$ and
$\mathsf{P}_Z^2$ on $\Omega \times \Omega$ satisfying the following conditions:

B1 The marginals of $\mathsf{P}_Z^1 + \mathsf{P}_Z^2$ onto the two components $\Omega$ of the product space are
both equal to the Wiener measure $\mathsf{W}$.

B2 Let $B_\kappa \subset \Omega \times \Omega$ denote the set of pairs $(\tilde w_x, \tilde w_y)$ such that there exists a function
$\tilde g_w : [0,1] \to \mathbf{R}^n$ satisfying

$$\tilde w_y(t) = \tilde w_x(t) + \int_0^t \tilde g_w(s)\,ds\;, \qquad \int_0^1 \|\tilde g_w(s)\|^2\,ds \le \kappa\;.$$

Then, there exists a value of $\kappa$ such that, for every admissible initial condition
$Z$, we have $\mathsf{P}_Z^1(B_\kappa) + \mathsf{P}_Z^2(B_\kappa) = 1$.

B3 Let $(x_t, y_t)$ be the process constructed by solving (SDE) with respective initial
conditions $x_0$ and $y_0$, and with respective noise processes $P_t(w_x, \tilde w_x)$ and
$P_t(w_y, \tilde w_y)$. Then, one has $x_1 = y_1$ for $\mathsf{P}_Z^1$-almost every noise $(\tilde w_x, \tilde w_y)$. Furthermore,
there exists a constant $\delta > 0$ such that $\mathsf{P}_Z^1(\Omega \times \Omega) \ge \delta$ for every
admissible initial condition $Z$.

Remark 5.7 Both measures $\mathsf{P}_Z^1$ and $\mathsf{P}_Z^2$ can easily be extended to measures on $\mathcal{W}_+^2$ in
such a way that B1 holds. Since the dynamic constructed from the coupling function
$\mathcal{C}$ will not depend on this extension, we just choose one arbitrarily and denote again by
$\mathsf{P}_Z^1$ and $\mathsf{P}_Z^2$ the corresponding measures on $\mathcal{W}_+^2$.

Given $\mathsf{P}_Z^1$ and $\mathsf{P}_Z^2$, we construct the coupling function $\mathcal{C}$ in the following way, using
notations (5.2) and (4.4):

$$T(Z, a) = 1\;, \qquad \mathcal{W}_2(Z, a) = \mathsf{P}_Z^1 \times \delta_{a_1} + \mathsf{P}_Z^2 \times \delta_{a_2}\;, \tag{5.11}$$

where the two elements $a_1$ and $a_2$ are defined as

$$a_1 = (2, 0, \tilde N, 0)\;, \tag{5.12a}$$
$$a_2 = (3, 0, \tilde N + 1, t_* \tilde N^{4/(1-2\alpha)})\;, \tag{5.12b}$$

for some constant $t_*$ to be determined later in this section. Notice that this definition
reflects the algorithm (4.7) and the explanation following (5.2). The reason behind the
particular choice of the waiting time in (5.12b) will become clear in Remark 5.11.

The construction of $\mathsf{P}_Z^1$ and $\mathsf{P}_Z^2$ is very close to the binding construction
in [Hai02]. The main difference is that the construction presented in [Hai02]
does not allow B2 above to be satisfied. We will therefore introduce a symmetrised version
of the binding construction that allows us to gain better control over $\tilde g_w$. If $\mu_1$ and $\mu_2$
are two positive measures with densities $D_1$ and $D_2$ with respect to some common
measure $\mu$, we define the measure $\mu_1 \wedge \mu_2$ by

$$(\mu_1 \wedge \mu_2)(dw) = \min\{D_1(w), D_2(w)\}\,\mu(dw)\;.$$

The key ingredient for the construction of $\mathsf{P}_Z^1$ and $\mathsf{P}_Z^2$ is the following lemma, the proof
of which will be given later in this section.

Lemma 5.8 Let $Z = (x_0, y_0, w_x, w_y)$ be an admissible initial condition and let $H$,
$\sigma$, and $f$ satisfy the hypotheses of either Theorem 1.2 or Theorem 1.3. Then, there
exists a measurable map $\Psi_Z : \Omega \to \Omega$ with measurable inverse, having the following
properties.

B1' There exists a constant $\delta > 0$ such that $\mathsf{W} \wedge \Psi_Z^*\mathsf{W}$ has mass bigger than $2\delta$ for
every admissible initial condition $Z$.

B2' There exists a constant $\kappa$ such that $\{(\tilde w_x, \tilde w_y) \mid \tilde w_y = \Psi_Z(\tilde w_x)\} \subset B_\kappa$ for every
admissible initial condition $Z$.

B3' Let $(x_t, y_t)$ be the process constructed by solving (SDE) with respective initial
conditions $x_0$ and $y_0$, and with noise processes $P_t(w_x, \tilde w_x)$ and $P_t(w_y, \Psi_Z(\tilde w_x))$.
Then, one has $x_1 = y_1$ for every $\tilde w_x \in \Omega$ and every admissible initial condition
$Z$.

Furthermore, the maps $\Psi_Z$ and $\Psi_Z^{-1}$ are measurable with respect to $Z$.

Given such a ^P^, we first define the maps and from to x SI by 

(See also Figure 1 below.) We also define the "switch map" $S : \Omega \times \Omega \to \Omega \times \Omega$ by

$$S(\tilde w_x, \tilde w_y) = (\tilde w_y, \tilde w_x)\;.$$




Figure 1: Construction of P^. 
With these definitions at hand, we construct two measures $\tilde{\mathsf{P}}_Z^1$ and $\mathsf{P}_Z^1$ on $\Omega \times \Omega$ by

Pz = ^(*tW A^-^W) , P^=P^ + S'*P^. (5.13) 

On Figure 1, $\tilde{\mathsf{P}}_Z^1$ lives on the boldfaced curve and $\mathsf{P}_Z^1$ is its symmetrised version which
lives on both the boldfaced and the dashed curve. Denote by $\Pi_i : \Omega \times \Omega \to \Omega$ the
projectors onto the $i$th component and by $\Delta : \Omega \to \Omega \times \Omega$ the lift onto the diagonal
$\Delta(w) = (w, w)$. Then, we define the measure $\mathsf{P}_Z^2$ by

P| = S*Pl + A* (W - n^P^) . (5.14) 

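By (5.13), $\mathsf{W} > \Pi_1^*\mathsf{P}_Z^1$, so $\mathsf{P}_Z^1$ and $\mathsf{P}_Z^2$ are both positive and their sum is a probability
measure. Furthermore, one has by definition

$$\mathsf{P}_Z^1 + \mathsf{P}_Z^2 = \mathsf{P}_Z^1 + \Delta^*\bigl(\mathsf{W} - \Pi_1^*\mathsf{P}_Z^1\bigr)\;.$$

Since $\Pi_1^*\Delta^*$ is the identity, this immediately implies

$$\Pi_1^*\mathsf{P}_Z^1 + \Pi_1^*\mathsf{P}_Z^2 = \mathsf{W}\;.$$

The symmetry $S^*\mathsf{P}_Z^1 = \mathsf{P}_Z^1$ then implies that the second marginal is also equal to $\mathsf{W}$,
i.e. B1 is satisfied. Furthermore, the set $\{(\tilde w_x, \tilde w_y) \mid \tilde w_y = \Psi_Z(\tilde w_x)\}$ has $\mathsf{P}_Z^1$-measure
bigger than $\delta$ by B1', so B3 is satisfied as well. Finally, B2 is an immediate consequence
of B2'. It remains to construct the function $\Psi_Z$.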
Proof of Lemma 5.8. As previously, we write $Z$ as

$$Z = (x_0, y_0, w_x, w_y)\;. \tag{5.15}$$

In order to construct $\Psi_Z$, we proceed as in [Hai02, Sect. 5], except that we want the
solutions $x_t$ and $y_t$ to become equal after time 1. Let $\tilde w_x \in \Omega$ be given and define

$$B_H^x(t) = \bigl(\mathcal{D}_H P_1(w_x, \tilde w_x)\bigr)(t - 1)\;, \tag{5.16}$$

where $w_x$ denotes the corresponding part of the initial condition $Z$ in (5.15). We write
the solutions to (SDE) as

$$dx_t = f(x_t)\,dt + \sigma\,dB_H^x(t)\;, \tag{5.17a}$$
$$dy_t = f(y_t)\,dt + \sigma\,dB_H^x(t) + \sigma\,g_B(t)\,dt\;, \tag{5.17b}$$

where $g_B(t)$ is a function to be determined. Notice that $x_t$ is completely determined by
$\tilde w_x$ and by the initial condition $Z$. We introduce the process $\varrho_t = y_t - x_t$, so we get

$$\frac{d\varrho_t}{dt} = f(x_t + \varrho_t) - f(x_t) + \sigma\,g_B(t)\;. \tag{5.18}$$

We now define $g_B(t)$ by

$$g_B(t) = -\sigma^{-1}\Bigl(\kappa_1 \varrho_t + \kappa_2 \frac{\varrho_t}{\sqrt{\|\varrho_t\|}}\Bigr)\;, \tag{5.19}$$

for two constants $\kappa_1$ and $\kappa_2$ to be specified. This yields for the norm of $\varrho_t$ the estimate

$$\frac{d\|\varrho_t\|^2}{dt} \le 2\bigl(C_3^{A1} - \kappa_1\bigr)\|\varrho_t\|^2 - 2\kappa_2\,\|\varrho_t\|^{3/2}\;.$$

We choose $\kappa_1 = C_3^{A1}$ and so

Hp II < J (6'«2t - VWqoWY for t < v /||go|| /(6K2), .3 
\0 fort> v^/(6k2). 

We can then choose $\kappa_2$ sufficiently large, so that $\|\varrho_t\| = 0$ for $t > 1/2$. Since the initial
condition was admissible by assumption, the constant $\kappa_2$ can be chosen as a function
of the constants $C_i^{A1}$ only. Notice also that the preceding construction yields $g_B$ as a
function of $Z$ and $\tilde w_x$ only.
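The finite-time extinction of $\|\varrho_t\|$ can be made explicit in a small sketch. With $\kappa_1 = C_3^{A1}$, the differential inequality for $\|\varrho_t\|^2$ reduces to $u' \le -\kappa_2\sqrt{u}$ for $u = \|\varrho_t\|$, and the comparison solution is $u(t) = \max\{\sqrt{u_0} - \kappa_2 t/2, 0\}^2$. This closed form is derived here for illustration and is not quoted from the text; the value `rho0_max` standing for $1 + C_4^{A1}/C_2^{A1}$ is hypothetical.

```python
import math


def rho_norm_bound(t, rho0, kappa2):
    """Solution of u' = -kappa2 * sqrt(u) with u(0) = rho0, stopped at zero."""
    return max(math.sqrt(rho0) - 0.5 * kappa2 * t, 0.0) ** 2


def kappa2_for_half_time(rho0_max):
    """Smallest kappa2 for which the comparison solution vanishes at t = 1/2."""
    return 4.0 * math.sqrt(rho0_max)


rho0_max = 5.0  # hypothetical value of 1 + C_4^{A1} / C_2^{A1}
kappa2 = kappa2_for_half_time(rho0_max)

# Central finite difference at t = 0.1 to confirm that the ODE is satisfied.
t, h = 0.1, 1e-4
derivative = (rho_norm_bound(t + h, rho0_max, kappa2)
              - rho_norm_bound(t - h, rho0_max, kappa2)) / (2.0 * h)
residual = derivative + kappa2 * math.sqrt(rho_norm_bound(t, rho0_max, kappa2))
print(kappa2, residual, rho_norm_bound(0.5, rho0_max, kappa2))
```

Since admissibility bounds $\|\varrho_0\|$ uniformly, a single choice of $\kappa_2$ of this kind works for all admissible initial conditions.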

We then construct $\tilde w_y = \Psi_Z(\tilde w_x)$ in such a way that (5.17) is satisfied with the
function $g_B$ we just constructed. Define $g_w$ by (5.7) and construct $g_b$ by applying
(4.11b). Then, we extend $g_B$ to $(-\infty, 1]$ by simply putting it equal to $g_b$ on $(-\infty, 0]$.
Applying the inverse formula (4.11a), we obtain a function $\tilde g_w$ on $(-\infty, 1]$, which is
equal to $g_w$ on $(-\infty, 0]$ and which is such that

$$\bigl(\Psi_Z(\tilde w_x)\bigr)(t) = \tilde w_x(t) + \int_0^t \tilde g_w(s)\,ds$$

has precisely the required property.
It remains to check that the family of maps $\Psi_Z$ constructed this way has the properties
stated in Lemma 5.8. The inverse of $\Psi_Z$ is constructed in the following way.
Choose $\tilde w_y \in \Omega$ and consider the solution to the equation

$$dy_t = f(y_t)\,dt + \sigma\,dB_H^y(t)\;,$$

where $B_H^y$ is defined as in (5.16) with $x$ replaced by $y$. Once $y_t$ is obtained, one can
construct the process $\varrho_t$ as before, but this time by solving

$$\frac{d\varrho_t}{dt} = f(y_t) - f(y_t - \varrho_t) - \Bigl(\kappa_1 \varrho_t + \kappa_2 \frac{\varrho_t}{\sqrt{\|\varrho_t\|}}\Bigr)\;.$$

This allows us to define $g_B$ as in (5.19). The element $\tilde w_x = \Psi_Z^{-1}(\tilde w_y)$ is then obtained by
the same procedure as before.

Before turning to the proof of properties B1'-B3', we give some estimate on the
function $\tilde g_w$ that we just constructed.

Lemma 5.9 Assume that the conditions of Lemma 5.8 hold. Then, there exists a constant
$K$ such that the function $\tilde g_w(Z, \tilde w_x)$ constructed above satisfies

$$\int_0^1 \|\tilde g_w(Z, \tilde w_x)(s)\|^2\,ds < K\;,$$

for every admissible initial condition $Z$ and for every $\tilde w_x \in \mathcal{W}_+$.

Proof. We write $\tilde g_w(t)$ for $t > 0$ as

$$\tilde g_w(t) = C \int_{-\infty}^0 \frac{t^{\frac12-H}(-s)^{H-\frac12}}{t-s}\,g_w(s)\,ds + \alpha_H \frac{d}{dt} \int_0^t (t-s)^{\frac12-H} g_B(s)\,ds
\equiv \tilde g_w^{(1)}(t) + \tilde g_w^{(2)}(t)\;,$$

where $g_w$ is defined by (5.7), $g_B$ is given by (5.19), and the constant $C$ is the constant
appearing in (4.13). The $L^2$-norm of $\tilde g_w^{(1)}$ is bounded by 1 by the assumption that $Z$
is admissible. To bound the norm of $\tilde g_w^{(2)}$, we treat the cases $H < \tfrac12$ and $H > \tfrac12$
separately.

The case $H < \tfrac12$. For this case, we simply combine Lemma 5.2 with the definition
(5.19) and the estimate (5.20).

The case $H > \tfrac12$. For this case, we apply Lemma 5.1, so we bound the $\|\cdot\|_{1,1}$-norm of
$g_B$. By (5.19), one has

$$\Bigl\|\frac{dg_B(t)}{dt}\Bigr\| \le C\,\Bigl\|\frac{d\varrho_t}{dt}\Bigr\|\Bigl(1 + \frac{1}{\sqrt{\|\varrho_t\|}}\Bigr)\;, \tag{5.21}$$

for some positive constant $C$. Using (5.18), the assumption about the boundedness of
the derivative of $f$, and the definition (5.19) we get

$$\Bigl\|\frac{d\varrho_t}{dt}\Bigr\| \le C\bigl(\|\varrho_t\| + \sqrt{\|\varrho_t\|}\bigr)\;.$$

Combining this with (5.21) and (5.20), the required bound on $\|g_B\|_{1,1}$ follows. □
Property B1' now follows from Lemma 5.9 and Girsanov's theorem in the following
way. Denote by $D_Z$ the density of $\Psi_Z^*\mathsf{W}$ with respect to $\mathsf{W}$, i.e. $(\Psi_Z^*\mathsf{W})(d\tilde w_x) =
D_Z(\tilde w_x)\,\mathsf{W}(d\tilde w_x)$. It is given by Girsanov's formula

$$D_Z(\tilde w_x) = \exp\Bigl(\int_0^1 \bigl\langle \tilde g_w(Z, \tilde w_x)(t)\,,\,d\tilde w_x(t)\bigr\rangle - \frac12 \int_0^1 \|\tilde g_w(Z, \tilde w_x)\|^2(t)\,dt\Bigr)\;.$$

One can check (see e.g. [Mat02]) that $\|\mathsf{W} \wedge \Psi_Z^*\mathsf{W}\|_{TV}$ is bounded from below by

$$\|\mathsf{W} \wedge \Psi_Z^*\mathsf{W}\|_{TV} \ge \Bigl(4 \int D_Z(w)^{-2}\,\mathsf{W}(dw)\Bigr)^{-1}\;.$$

Property B1' thus follows immediately from Lemma 5.9, using the fact that

$$\int \exp\Bigl(-2 \int_0^1 \bigl\langle \tilde g_w(Z, \tilde w_x)(t)\,,\,d\tilde w_x(t)\bigr\rangle - 2 \int_0^1 \|\tilde g_w(Z, \tilde w_x)\|^2(t)\,dt\Bigr)\,\mathsf{W}(dw) = 1\;.$$

Property B2' is also an immediate consequence of Lemma 5.9, and property B3' follows
by construction from (5.20). The proof of Lemma 5.8 is complete. □

Before concluding this subsection we show that, if step 1 fails, $t_*$ can be chosen in
such a way that the waiting time $t_* \tilde N^{4/(1-2\alpha)}$ in (5.12b) is long enough so that (5.10)
holds again after step 3 and so that the cost function does not increase by more than
$1/(2\tilde N^2)$. By the triangle inequality, the second claim follows if we show that

$$\mathcal{K}_\alpha\bigl(\theta_t \tilde g_w(Z, \tilde w_x)\bigr) \le \frac{1}{2\tilde N^2}\;, \tag{5.22}$$

whenever $t$ is large enough (the shift $\theta_t$ is as in (5.9)). Combining (4.14), Lemma 5.9,
and the definition of $\mathcal{K}_\alpha$, we get, for some constant $C$,

$$\mathcal{K}_\alpha\bigl(\theta_t \tilde g_w(Z, \tilde w_x)\bigr) \le C\,t^{\alpha-\frac12} + C\,t^{H-\frac32}\;, \qquad \text{for } t > 2.$$

There thus exists a constant $t_*$ such that the bound (5.22) is satisfied if the waiting time
is longer than $t_* \tilde N^{4/(1-2\alpha)}$. It remains to show that (5.10) holds after the waiting time
is over. If step 1 failed, the realisations $\tilde w_x$ and $\tilde w_y$ are drawn either in the set

$$A_1 = \{(\tilde w_x, \tilde w_y) \in \Omega^2 \mid \tilde w_x = \tilde w_y\}\;,$$

or in the set

$$A_2 = \{(\tilde w_x, \tilde w_y) \in \Omega^2 \mid \tilde w_x = \Psi_Z(\tilde w_y)\}$$

(see Figure 1). In order to describe the dynamics also during the waiting time (i.e. step
3), we extend those sets to $\mathcal{W}_+^2$ by

$$\tilde A_i = \bigl\{(\tilde w_x, \tilde w_y) \in \mathcal{W}_+^2 \mid (\tilde w_x|_{[0,1]}, \tilde w_y|_{[0,1]}) \in A_i\;, \text{ and } \tilde w_x(t) - \tilde w_y(t) = \text{const for } t > 1\bigr\}\;.$$
Given an admissible initial condition $Z = (x_0, y_0, w_x, w_y)$ and a pair $(\tilde w_x, \tilde w_y) \in \mathcal{W}_+^2$,
we consider the solutions $x_t$ and $y_t$ to (SDE) given by

$$dx_t = f(x_t)\,dt + \sigma\,dB_H^x(t)\;, \qquad dy_t = f(y_t)\,dt + \sigma\,dB_H^y(t)\;, \tag{5.23}$$

where $B_H^x$ (and similarly for $B_H^y$) is constructed as usual by concatenating $w_x$ and $\tilde w_x$
and applying the operator $\mathcal{D}_H$. The key observation is the following lemma.

Lemma 5.10 Let $Z$ be an admissible initial condition as above, let $(\tilde w_x, \tilde w_y) \in \tilde A_1 \cup
\tilde A_2$, and let $x_t$ and $y_t$ be given by (5.23) for $t > 0$. Then, there exists a constant $t_* > 0$
such that

$$\|x_t - y_t\| \le 1 + \frac{C_4^{A1}}{C_2^{A1}}$$

holds again for $t > t_*$.

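Proof. Fix an admissible initial condition $Z$ and consider the case when $(\tilde w_x, \tilde w_y) \in \tilde A_2$
first. Let $g_w : \mathbf{R}_- \to \mathbf{R}^n$ be as in (5.7) and define $\tilde g_w : \mathbf{R}_+ \to \mathbf{R}^n$ by

$$\tilde w_y(t) = \tilde w_x(t) + \int_0^t \tilde g_w(s)\,ds\;.$$

Introducing $\varrho_t = y_t - x_t$, we see that it satisfies the equation

$$\frac{d\varrho_t}{dt} = f(y_t) - f(x_t) + \sigma\,G_t\;, \tag{5.24}$$

where the function $G_t$ is given by

$$G_t = c_1 \int_{-\infty}^0 (t-s)^{H-\frac32} g_w(s)\,ds + c_2 \frac{d}{dt} \int_0^t (t-s)^{H-\frac12} \tilde g_w(s)\,ds\;, \tag{5.25}$$

with some constants $c_1$ and $c_2$ depending only on $H$. It follows from (5.24), (5.3), and
Gronwall's lemma, that the Euclidean norm $\|\varrho_t\|$ satisfies the inequality

$$\|\varrho_t\| \le e^{-C_2^{A1} t}\,\|\varrho_0\| + \int_0^t e^{-C_2^{A1}(t-s)}\bigl(C_4^{A1} + \|G_s\|\bigr)\,ds\;. \tag{5.26}$$

Consider first the time interval $[0,1]$ and define

$$\hat G_t = c_1 \int_{-\infty}^0 (t-s)^{H-\frac32} g_w(s)\,ds - c_2 \frac{d}{dt} \int_0^t (t-s)^{H-\frac12} \tilde g_w(s)\,ds\;,$$

i.e., we simply reversed the sign of $\tilde g_w$. This corresponds to the case where $(\tilde w_x, \tilde w_y)$
are interchanged, and thus satisfy $\tilde w_y = \Psi_Z(\tilde w_x)$ instead of $\tilde w_x = \Psi_Z(\tilde w_y)$. We thus
deduce from (5.19) and (5.20) that

$$\|\hat G_s\| \le \|\sigma^{-1}\|\bigl(\kappa_1 \|\varrho_0\| + \kappa_2 \sqrt{\|\varrho_0\|}\bigr)\;, \tag{5.27}$$

for $s \in [0,1]$. This yields for $\|G_s\|$ the estimate

$$\begin{aligned}
\|G_s\| &\le \|\sigma^{-1}\|\bigl(\kappa_1 \|\varrho_0\| + \kappa_2 \sqrt{\|\varrho_0\|}\bigr) + 2c_1 \int_{-\infty}^0 (s-r)^{H-\frac32}\,\|g_w(r)\|\,dr \\
&\le \|\sigma^{-1}\|\bigl(\kappa_1 \|\varrho_0\| + \kappa_2 \sqrt{\|\varrho_0\|}\bigr) + 1\;,
\end{aligned} \tag{5.28}$$

where we used the fact that $Z$ is admissible for the second step. Notice that (5.28) only
holds for $s \in [0,1]$, so we consider now the case $s > 1$. In this case, we can write $G_t$
as

$$G_t = c_1 \int_{-\infty}^0 (t-s)^{H-\frac32} g_w(s)\,ds + c_1 \int_0^1 (t-s)^{H-\frac32} \tilde g_w(s)\,ds\;.$$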






The first term is bounded by 1 as before. In order to bound the second term, we use
Lemma 5.9, so we get



This function has a singularity at $t = 1$, but this singularity is always integrable. For
$t > 2$ say, it behaves like $t^{H-\frac32}$. Putting the estimates (5.28) and (5.29) into (5.26),
we see that there exists a constant $C$ depending only on $H$ and on the parameters in
assumption A1 such that, for $t > 2$, one has the estimate

The claim follows at once. □

Remark 5.11 To summarise, we have shown the following in this section:

1. There exists a positive constant $\delta$ such that if the state $Z$ of the coupled system
is admissible, step 1 has a probability larger than $\delta$ to succeed.

2. If step 1 fails and the waiting time for step 3 is chosen larger than $t_* \tilde N^{4/(1-2\alpha)}$,
then the state of the coupled system is again admissible after the end of step 3,
provided the cost $\mathcal{K}_\alpha(Z)$ at the beginning of step 1 was smaller than $1 - \frac{1}{2\tilde N^2}$.

3. The increase in the cost between the beginning of step 1 and the end of
step 3 is smaller than $\frac{1}{2\tilde N^2}$.

In the following subsection, we will define step 2 and so conclude the construction
and the analysis of the coupling function $\mathcal{C}$.
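The bookkeeping on the auxiliary space defined so far (stages $S = 0$, $S = 3$ and $S = 1$) can be summarised in a short sketch. It only tracks the auxiliary variable $a = (S, N, \tilde N, T_3)$ and the duration of each step; the outcome of step 1 is passed in as a flag rather than sampled. The names `Aux` and `step`, and the parameters `t_star` and `initial_wait`, are hypothetical and introduced for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aux:
    S: int        # next step to perform: 0, 1, 2 or 3
    N: int        # counter of consecutive occurrences of step 2
    N_tilde: int  # counter of failures of step 1 or step 2
    T3: float     # duration of the upcoming step 3


A0 = Aux(S=0, N=1, N_tilde=1, T3=0.0)  # initial value a_0 = (0, 1, 1, 0)


def step(a, step1_succeeds, alpha, t_star, initial_wait=1.0):
    """Return (duration of the current step, next auxiliary state).

    `initial_wait` stands for T(Z, a_0) in (5.4), which depends on Z.
    """
    if a.S == 0:  # initial stage, (5.4)
        return initial_wait, Aux(1, 0, 0, 0.0)
    if a.S == 3:  # waiting stage, (5.5): counters unchanged
        return a.T3, Aux(1, a.N, a.N_tilde, 0.0)
    if a.S == 1:  # hitting stage, (5.11) and (5.12)
        if step1_succeeds:
            return 1.0, Aux(2, 0, a.N_tilde, 0.0)
        wait = t_star * a.N_tilde ** (4.0 / (1.0 - 2.0 * alpha))
        return 1.0, Aux(3, 0, a.N_tilde + 1, wait)
    raise ValueError("step 2 is only defined in the next subsection")


# A failed step 1 with N_tilde = 2 and alpha = 1/4 gives a waiting time 2^8.
duration, nxt = step(Aux(1, 0, 2, 0.0), False, alpha=0.25, t_star=1.0)
print(duration, nxt)
```

With this waiting time, $t^{\alpha - 1/2} = t_*^{\alpha-1/2}\,\tilde N^{-2}$, which is the origin of the $1/\tilde N^2$ scaling of the cost increase in Remark 5.11.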

5.4 Coupling stage (S = 2)

In this subsection, we construct and analyse the coupling map $\mathcal{C}$ corresponding to step
2. Following (4.7), we construct it in such a way that, with positive probability, the
two copies of the process (SDE) are driven with the same noise. In other terms, if
$Z = (x_0, y_0, w_x, w_y)$ denotes the state of our coupled system at the beginning of step
2, we construct a measure $\mathsf{P}_Z$ on $\mathcal{W}_+^2$ such that if $(\tilde w_x, \tilde w_y)$ is drawn according to $\mathsf{P}_Z$,
then one has

$$\bigl(\mathcal{D}_H(w_x \sqcup \tilde w_x)\bigr)(t) = \bigl(\mathcal{D}_H(w_y \sqcup \tilde w_y)\bigr)(t)\;, \qquad t > 0, \tag{5.30}$$

with positive probability. Here, $\sqcup$ denotes the concatenation operator given by

$$(w \sqcup \tilde w)(t) = \begin{cases} w(t) & \text{for } t < 0, \\ \tilde w(t) & \text{for } t \ge 0. \end{cases}$$

In the notation (5.2), step 2 will have a duration $2^N$ and $N$ will be incremented by 1
every time step 2 succeeds.

The construction of $\mathsf{P}_Z$ will be similar in spirit to the construction of the previous
section. We therefore introduce as before the function $\tilde g_w$ given by

$$\tilde w_y(t) = \tilde w_x(t) + \int_0^t \tilde g_w(s)\,ds\;. \tag{5.31}$$

Our main concern is of course to get good bounds on this function $\tilde g_w$. This is achieved
by the following lemma, which is crucial in the process of showing that step 2 will
eventually succeed infinitely often.

Lemma 5.12 Let $Z_0$ be an admissible initial condition and denote by $\mathcal{T}$ the measure
on $\mathcal{X}^2 \times \mathcal{W}^2$ obtained by evolving $Z_0$ according to the successful realisation of step
1. Then, there exists a constant $K > 0$ depending only on $H$, $\alpha$, and the parameters
appearing in A1, such that for $\mathcal{T}$-almost every $Z = (x, y, w_x, w_y)$, and for every pair
$(\tilde w_x, \tilde w_y)$ satisfying (5.30), we have the bounds

$$\|\tilde g_w\|_\alpha \le K\;, \qquad \Bigl\|\frac{d\tilde g_w}{dt}\Bigr\|_{\alpha+1} \le K\;. \tag{5.32}$$

Furthermore, one has $x = y$, $\mathcal{T}$-almost surely.

Proof. It is clear from Lemma 5.8 that $x = y$. Let now $Z$ be an element drawn according
to $\mathcal{T}$ and denote by $g_w : \mathbf{R}_- \to \mathbf{R}^n$ the function formally defined by

$$dw_y(t) = dw_x(t) + g_w(t)\,dt\;. \tag{5.33}$$

We also denote by $g_b : \mathbf{R}_- \to \mathbf{R}^n$ the function such that

$$dB_y(t) = dB_x(t) + g_b(t)\,dt\;, \tag{5.34}$$

where $B_x = \mathcal{D}_H w_x$ and $B_y = \mathcal{D}_H w_y$. (Note that $g_w$ and $g_b$ are almost surely
well-defined, so we discard elements $Z$ for which they cannot be defined.) Since $Z$
corresponds almost surely to a successful realisation of step 1, $g_b$ is equal on the interval
$[-1, 0]$ (up to translation in time) to the function $g_B$ constructed in (5.19). By (5.20),
there exists therefore a constant $C_g$ such that
Combining the linearity of $\mathcal{D}_H$ with (4.13), one can see that if $(\tilde w_x, \tilde w_y)$ satisfy (5.30),
then the function $\tilde g_w$ is given by the formula

$$\tilde g_w(t) = C_1 \int_{-\infty}^{-1} \frac{(t+1)^{\frac12-H}\,(-1-s)^{H-\frac12}}{t-s}\,g_w(s)\,ds + C_2 \int_{-1}^{-1/2} (t-s)^{-H-\frac12}\,g_b(s)\,ds\;, \tag{5.36}$$

for some constants $C_1$ and $C_2$ depending only on $H$. Notice that the second integral
only goes up to $-1/2$ because of (5.35).

Since the initial condition $Z_0$ is admissible by assumption, the $\|\cdot\|_\alpha$-norm of the
first term is bounded by 1. The $\|\cdot\|_\alpha$-norm of the second term is also bounded by a
constant, using (5.35) and the assumption $\alpha < H$.

Differentiating (5.36) with respect to $t$, we see that there exists a constant $K$ such that



dgwit) 



dt 



< 



K 



\t + l\i-"\s 



1 



\gw(s)\\ ds 



/-1/2 ^ . 

+ J ^ (t-s)-"-^\\ghis)\\ds) , 



(5.37) 



and the bound on the derivative follows as previously. □ 
The definition of our coupling function will be based on the following lemma: 

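Lemma 5.13 Let $\mathcal{N}$ be the normal distribution on $\mathbf{R}$, choose $a \in \mathbf{R}$, $b \ge |a|$, and
define $M = \max\{4b, 2\log(8/b)\}$. Then, there exists a measure $\mathcal{N}^2_{a,b}$ on $\mathbf{R}^2$ satisfying
the following properties:

1. Both marginals of $\mathcal{N}^2_{a,b}$ are equal to $\mathcal{N}$.

2. If $|b| \le 1$, one has

$$\mathcal{N}^2_{a,b}\bigl(\{(x,y) \mid y = x + a\}\bigr) > 1 - b\;.$$

Furthermore, the above quantity is always positive.

3. One has

$$\mathcal{N}^2_{a,b}\bigl(\{(x,y) \mid |y - x| \le M\}\bigr) = 1\;.$$

Proof. Consider the following picture:

[Picture not reproduced.]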
Denote by $\mathcal{N}_x$ the normal distribution on the set $L_x = \{(x,y) \mid y = 0\}$ and by $\mathcal{N}_y$
the normal distribution on the set $L_y = \{(x,y) \mid x = 0\}$. We also define the maps
$\pi_{i,x}$ (respectively $\pi_{i,y}$) from $L_x$ (respectively $L_y$) to $L_i$, obtained by only modifying the $y$
(respectively $x$) coordinate. Notice that these maps are invertible and denote their inverses
by $\tilde\pi_{i,x}$ (respectively $\tilde\pi_{i,y}$). We also denote by $\mathcal{N}_x|_M$ (respectively $\mathcal{N}_y|_M$) the restriction of $\mathcal{N}_x$
(respectively $\mathcal{N}_y$) to the square $[-\tfrac{M}{2}, \tfrac{M}{2}]^2$.

With these notations, we define the measure $\mathcal{N}_3$ on $L_3$ as

$$\mathcal{N}_3 = \pi_{3,x}^*\bigl(\mathcal{N}_x|_M\bigr) \wedge \pi_{3,y}^*\bigl(\mathcal{N}_y|_M\bigr)\;.$$

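The measure $\mathcal{N}^2_{a,b}$ is then defined as

$$\mathcal{N}^2_{a,b} = \mathcal{N}_3 + \pi_{2,x}^*\bigl(\mathcal{N}_x|_M - \tilde\pi_{3,x}^*\mathcal{N}_3\bigr) + \pi_{1,x}^*\bigl(\mathcal{N}_x - \mathcal{N}_x|_M\bigr)\;.$$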

A straightforward calculation, using the symmetries of the problem, shows that property
1 is indeed satisfied. Property 3 follows immediately from the construction, so it
remains to check that property 2 holds, i.e. that

$$\mathcal{N}_3(L_3) \ge 1 - b\;,$$

for $|b| \le 1$, and $\mathcal{N}_3(L_3) > 0$ otherwise. It follows from the definition of the total
variation distance $\|\cdot\|_{TV}$ that

$$\mathcal{N}_3(L_3) = 1 - \tfrac12\,\bigl\|\mathcal{N}_x|_M - \tau_a^*\bigl(\mathcal{N}_x|_M\bigr)\bigr\|_{TV}\;,$$

where $\tau_a(x) = x - a$. Since $M \ge 4b \ge 4a$, it is clear from the picture and from the fact
that the density of the normal distribution is everywhere positive, that $\mathcal{N}_3(L_3) > 0$ for
every $a \in \mathbf{R}$. It therefore suffices to consider the case $|b| \le 1$. Since $\int_{|x|>M/2} e^{-x^2/2}\,dx <
b/8$, one has $\|\mathcal{N}_x|_M - \mathcal{N}_x\|_{TV} \le b/4$, which implies

$$\mathcal{N}_3(L_3) \ge 1 - \frac{b}{4} - \frac12\,\|\mathcal{N}_x - \tau_a^*\mathcal{N}_x\|_{TV}\;.$$

A straightforward computation shows that, for $|a| \le 1$,

$$\|\mathcal{N}_x - \tau_a^*\mathcal{N}_x\|_{TV} \le \sqrt{e^{a^2} - 1} \le \sqrt{2}\,|a|\;,$$

and the claim follows. □
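The last chain of inequalities can be checked numerically. The sketch below assumes that $\|\cdot\|_{TV}$ denotes the total mass of $|\mu - \nu|$ (the $L^1$ distance between densities), which is the normalisation consistent with the factor $\frac12$ above; under that assumption the distance between $\mathcal{N}(0,1)$ and its translate by $a$ equals $2\,\mathrm{erf}\bigl(|a|/(2\sqrt2)\bigr)$. The function names and the grid of test values are choices made for this illustration.

```python
import math


def l1_distance_shifted_gaussians(a):
    """Exact L^1 distance between the densities of N(0,1) and N(a,1)."""
    return 2.0 * math.erf(abs(a) / (2.0 * math.sqrt(2.0)))


def chi_square_bound(a):
    """The intermediate bound sqrt(exp(a^2) - 1)."""
    return math.sqrt(math.expm1(a * a))


checks = []
for k in range(1, 101):
    a = k / 100.0  # values of |a| in (0, 1]
    tv = l1_distance_shifted_gaussians(a)
    mid = chi_square_bound(a)
    top = math.sqrt(2.0) * a
    # Lower bound on the coupled mass from the proof, taken with b = |a|.
    coupled_mass = 1.0 - a / 4.0 - 0.5 * tv
    checks.append((a, tv, mid, top, coupled_mass))

print(checks[-1])
```

The middle bound is the Cauchy-Schwarz estimate of the $L^1$ distance by the square root of the $\chi^2$ divergence, which equals $e^{a^2} - 1$ for two unit-variance Gaussians whose means differ by $a$.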
We will use the following corollary: 

Corollary 5.14 Let $\mathsf{W}$ be the Wiener measure on $\mathcal{W}_+$, let $g \in L^2(\mathbf{R}_+)$ with $\|g\| \le b$,
let $M = \max\{4b, 2\log(8/b)\}$, and define the map $\Psi_g : \mathcal{W}_+ \to \mathcal{W}_+$ by

$$(\Psi_g w)(t) = w(t) + \int_0^t g(s)\,ds\;.$$

Then, there exists a measure $\mathsf{W}^2_{g,b}$ on $\mathcal{W}_+^2$ such that the following properties hold:

1. Both marginals of $\mathsf{W}^2_{g,b}$ are equal to the Wiener measure $\mathsf{W}$.

2. If $b \le 1$, one has the bound

$$\mathsf{W}^2_{g,b}\bigl(\{(\tilde w_x, \tilde w_y) \mid \tilde w_y = \Psi_g(\tilde w_x)\}\bigr) \ge 1 - b\;. \tag{5.38}$$

Furthermore, at fixed $b > 0$, the above quantity is always positive and a decreasing
function of $\|g\|$.

3. The set

$$\Bigl\{(\tilde w_x, \tilde w_y) \;\Big|\; \exists \kappa : \tilde w_y(t) = \tilde w_x(t) + \kappa \int_0^t g(s)\,ds\;,\; |\kappa|\,\|g\| \le M\Bigr\}$$

has full $\mathsf{W}^2_{g,b}$-measure.

Proof. This is an immediate consequence of the $L^2$ expansion of white noise, using $g$
as one of the basis functions and applying Lemma 5.13 on that component. □

Given this result (and using the same notations as above), we turn to the construction
of the coupling function $\mathcal{C}$ for step 2. Given an initial condition $Z = (x_0, y_0, w_x, w_y)$,
remember that $g_w$ is defined by (5.7). We furthermore define the function $\tilde g_w : \mathbf{R}_+ \to
\mathbf{R}^n$ by

$$\tilde g_w(t) = C \int_{-\infty}^0 \frac{t^{\frac12-H}\,(-s)^{H-\frac12}}{t-s}\,g_w(s)\,ds\;, \tag{5.39}$$

with $C$ the constant appearing in (4.13). By (4.13), $\tilde g_w$ is the only function that ensures
that (5.30) holds if $\tilde w_x$ and $\tilde w_y$ are related by (5.31). (Notice that, although (5.36) seems
to differ substantially from (5.39), they do actually define the same function.) Given
$Z$ as above and $a \in \mathcal{A}$, denote by $g_{a,Z}$ the restriction of $\tilde g_w$ to the interval $[0, 2^N]$
(prolonged by $0$ outside). It follows from Lemma 5.12 that there exists a constant $K$
such that if the coupled process was in an admissible state at the beginning of step 1,
then the a-priori estimate

$$\|g_{a,Z}\|^2 = \int_0^{2^N} \|g_{a,Z}(s)\|^2\,ds \le C\,2^{-2\alpha N} \equiv b_N^2 \tag{5.40}$$

holds for some constant $C$. We thus define $b = \max\{b_N, \|g_{a,Z}\|\}$ and denote by $\mathsf{W}^1_{a,Z}$
the restriction of $\mathsf{W}^2_{g_{a,Z},b}$ to the "good" set (5.38) and by $\mathsf{W}^2_{a,Z}$ its restriction to the
complementary set.

We choose furthermore an arbitrary exponent $\beta$ satisfying the condition

$$\beta > \frac{1}{1 - 2\alpha}\;. \tag{5.41}$$

With these notations at hand, we define the coupling function for step 2:

$$T(Z, a) = 2^N\;, \qquad \mathcal{W}_2(Z, a) = \mathsf{W}^1_{a,Z} \times \delta_{a'} + \mathsf{W}^2_{a,Z} \times \delta_{a''}\;,$$

where

$$a' = (2, N+1, \tilde N, 0)\;, \qquad a'' = (3, 0, \tilde N + 1, t_* 2^{\beta N} \tilde N^{4/(1-2\alpha)})\;, \tag{5.42}$$

for some constant $t_*$ to be determined in the remainder of this section. The waiting
time in (5.42) has been chosen in such a way that the following holds.
Lemma 5.15 Let $(Z_0, a_0) \in \mathcal{X}^2 \times \mathcal{W}^2 \times \mathcal{A}$ with $Z_0$ admissible and denote by $\mathcal{T}$ the
measure on $\mathcal{X}^2 \times \mathcal{W}^2$ obtained by evolving it according to the successful realisation
of step 1, followed by $N$ successful realisations of step 2, one failed realisation of
step 2, and one waiting period 3. There exists a constant $t_*$ such that $\mathcal{T}$-almost every
$Z = (x, y, w_x, w_y)$ satisfies

$$\|x - y\| \le 1 + \frac{C_4^{A1}}{C_2^{A1}}\;, \qquad \mathcal{K}_\alpha(Z) \le \mathcal{K}_\alpha(Z_0) + \frac{1}{2\tilde N^2}\;,$$

where $\tilde N$ denotes the value of the corresponding component of $a_0$.

Proof. We first show the bound on the cost function. Given $Z$ distributed according
to $\mathcal{T}$ as in the statement, we define $g_w$ by (5.33) as usual. The bounds we get on the
function $g_w$ are schematically depicted in the following figure, where the time interval
$[t_2, t_3]$ corresponds to the failed realisation of step 2.

[Figure (5.43) not reproduced: schematic bounds on $g_w$ over the successive time intervals delimited by $t_1$, $t_2$, $t_3$.]

Notice that, except for the contribution coming from times smaller than $t_1$, we are
exactly in the situation of (4.12). Since the cost of a function is decreasing under time
shifts, the contribution to $\mathcal{K}_\alpha(Z)$ coming from $(-\infty, t_1]$ is bounded by $\mathcal{K}_\alpha(Z_0)$. Denote
by $g$ the function defined by

$$g(t) = \begin{cases} g_w(t + t_1) & \text{for } t \in [0, t_3 - t_1], \\ 0 & \text{otherwise.} \end{cases}$$

Using the definition of the cost function together with Proposition 4.4 and the
Cauchy-Schwarz inequality, we obtain for some constants $C_1$ and $C_2$ the bound



JCa.{Z)<JCcAZo) + CiJ\h\^"--^-\ti\^H-^g\\+C2 



where $\|\cdot\|$ denotes the $L^2$-norm. Since step 1 has length 1 and the $N$th occurrence of
step 2 has length $2^{N-1}$, we have

$$|t_3 - t_1| = 2^{N+1}\;, \qquad |t_3| = t_* 2^{\beta N} \tilde N^{4/(1-2\alpha)}\;.$$

In particular, one has $|t_3| > |t_3 - t_1|$ if $t_*$ is larger than 1. Since



\hV"-^-\ti\^"-^ < |t3|^-^|t3-il|^ < 



t3 
t3 - U 
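this yields (for a different constant $C_1$) the bound

$$\mathcal{K}_\alpha(Z) \le \mathcal{K}_\alpha(Z_0) + C_1 \Bigl|\frac{t_3}{t_3 - t_1}\Bigr|^{\alpha-\frac12} \|g\|_\alpha \le \mathcal{K}_\alpha(Z_0) + C_1\,\frac{2^{-\gamma N}}{\tilde N^2}\,\|g\|_\alpha\;,$$

where we defined $\gamma = (\beta - 1)(\tfrac12 - \alpha)$. Notice that (5.41) guarantees that $\gamma > \alpha$.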
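The exponent bookkeeping behind this bound can be verified with a few lines of arithmetic. The sketch below checks, for one set of test values, that $|t_3/(t_3 - t_1)|^{\alpha - 1/2} = (t_*/2)^{\alpha-1/2}\,2^{-\gamma N}\,\tilde N^{-2}$ and that the threshold $\beta = 1/(1-2\alpha)$ in (5.41) corresponds exactly to $\gamma = \alpha$. The numerical values and function names are arbitrary choices for this illustration.

```python
import math


def gamma_exponent(beta, alpha):
    """gamma = (beta - 1) * (1/2 - alpha)."""
    return (beta - 1.0) * (0.5 - alpha)


def ratio_factor(N, N_tilde, alpha, beta, t_star):
    """|t3 / (t3 - t1)|^(alpha - 1/2) with the step lengths of the construction."""
    t3 = t_star * 2.0 ** (beta * N) * N_tilde ** (4.0 / (1.0 - 2.0 * alpha))
    t3_minus_t1 = 2.0 ** (N + 1)
    return (t3 / t3_minus_t1) ** (alpha - 0.5)


alpha, beta, t_star = 0.25, 3.0, 2.0  # beta > 1 / (1 - 2 * alpha) = 2
N, N_tilde = 4, 3

g = gamma_exponent(beta, alpha)
lhs = ratio_factor(N, N_tilde, alpha, beta, t_star)
rhs = (t_star / 2.0) ** (alpha - 0.5) * 2.0 ** (-g * N) / N_tilde ** 2
print(g, lhs, rhs)
```

Any dependence on $t_*$ enters only through the prefactor $(t_*/2)^{\alpha - 1/2}$, which decreases as $t_*$ grows.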

We now bound the $\|\cdot\|_\alpha$-norm of $g$. We know from Lemma 5.12 that the contribution
coming from the time interval $[t_1, t_2]$ is bounded by some constant $K$. Furthermore,
by (5.40), we have for the contribution coming from the interval $[t_2, t_3]$ a bound
of the type

$$\int_{t_2}^{t_3} \|g_w(s)\|^2\,ds \le C\,(N+1)^2\;,$$

for some positive constant $C$. This yields for $g$ the bound

$$\|g\|_\alpha \le C\,(N+1)\,2^{\alpha N}\;,$$

for some other constant $C$. Since $\gamma > \alpha$, there exists a constant $C$ such that

/Co(Z) < /C„(Zo) + . 

By choosing $t_*$ sufficiently large, this proves the claim concerning the increase of the
total cost.

It remains to show that, at the end of step 3, the two realisations of (SDE) have not
drifted too far apart. Define $g_b$ by (5.34) as usual and notice that, by construction, $x_t = y_t$
for $t = t_2$. Writing as before $\varrho_t = y_t - x_t$, one has for $t > t_2$ the estimate



\Qt\\ < 



-Cf (t-s)| 



gb(s)\\ ds 



(5.44) 



t2 



We first estimate the contribution coming from the time interval $[t_2, t_3]$. Denote by
$\hat g : [t_2, t_3] \to \mathbf{R}^n$ the value $g_w$ would have taken, had the last occurrence of step 2
succeeded and not failed (this corresponds to the dashed curve in (5.43)). Defining
$\tilde g = g_w - \hat g$, we have by (4.11e) that, on the interval $t \in [t_2, t_3]$,



gtit) = Ci 



(t-t2p 



C2 



t2 (i - s) 2 



■ ds 



(5.45) 



By Corollary 5.14 and the construction of the coupling function, $\tilde g$ is proportional to $g_w$ and, by (5.40), we also have for $\hat g$ a bound of the type $\|\hat g\| \le C(N+1)$ (the norm is the $L^2$-norm over the time interval $[t_2, t_3]$). Furthermore, (5.37) yields $\|\frac{d\hat g}{dt}\| \le C(N+1)2^{\alpha N}$. Recall that every differentiable function defined on an interval of length $L$ satisfies
(The norms are $L^2$-norms.) Using this to bound the first term in (5.45) and the Cauchy-Schwarz inequality for the second term, we get a constant $C$ such that $g_b$ is bounded by

\\gb{t)\\ < C(N + 1)(1 + 2-^(t - fa)"^"^) . 

From this and (5.44), we get another constant $C$ such that $\|\varrho_t\| \le C(N+1)$ at the time $t = t_3$. We finally turn to the interval $[t_3, 0]$. It follows from (4.11c) that, for some constant $C$, we have

\\9bm<^ + C\t-h\"-^g\\, 

where the term ^ is the contribution from the times smaller than ti . Since we know 
by (5.40) and Corollary 5.14 that the L-^-norm of g is bounded by C(N + 1) for some 
constant $C$, we obtain the required estimate by choosing $t_*$ sufficiently large. □

Remark 5.16 To summarise this subsection, we have shown the following, assuming that the coupled system was in an admissible state before performing step 1 and that step 1 succeeded:

1. There exist constants $\delta' \in (0, 1)$ and $K > 0$ such that the $N$th consecutive occurrence of step 2 succeeds with probability larger than $\max\{\delta', 1 - K2^{-\alpha N}\}$. This occurrence has length $2^N$.

2. If the $N$th occurrence of step 2 fails and the waiting time for step 3 is chosen longer than $t_* 2^{\beta N} N^{4/(1-2\alpha)}$, then the state of the coupled system is again admissible after the end of step 3, provided that the cost $\mathcal{K}_\alpha(Z)$ at the beginning of step 1 was smaller than $1 - \frac{1}{2N}$.

3. The increase in the cost given between the beginning of step 1 and the end of step 3 is smaller than $\frac{1}{2N^2}$.

Now that the construction of the coupling function C is completed, we can finally 
turn to the proof of the results announced in the introduction. 

6 Proof of the main result 

Let us first reformulate Theorems 1.2 and 1.3 in a more precise way, using the notations developed in this paper.

Theorem 6.1 Let $H \in (0, 1) \setminus \{\frac12\}$, let $f$ and $\sigma$ satisfy A1-A3 if $H < \frac12$ and A1, A2', A3 if $H > \frac12$, and let $\gamma < \max_{\alpha < H} \alpha(1 - 2\alpha)$. Then, the SDS defined in Proposition 3.11 has a unique invariant measure $\mu_*$. Furthermore, there exist positive constants $C$ and $\delta$ such that, for every generalised initial condition $\mu$, one has

$$\|\mathcal{Q}\mathcal{Q}_t\mu - \mathcal{Q}\mu_*\|_{\mathrm{TV}} \le 2\mu(\{\|x_0\| > e^{\delta t}\}) + Ct^{-\gamma}. \qquad (6.1)$$

Proof. The existence of $\mu_*$ follows from Proposition 3.12 and Lemma 2.20. Furthermore, the assumptions of Proposition 2.18 hold by the invertibility of $\sigma$, so the uniqueness of $\mu_*$ will follow from (6.1).

Denote by $\varphi$ the SDS constructed in Proposition 3.11, and consider the self-coupling $Q(\mu, \mu_*)$ for $\varphi$ constructed in Section 5. We denote by $(x_t, y_t)$ the canonical process associated to $Q(\mu, \mu_*)$ and we define a random time $\tau_\infty$ by

$$\tau_\infty = \inf\{t > 0 \mid x_s = y_s \ \forall s \ge t\}.$$

It then follows immediately from (4.2) that

$$\|\mathcal{Q}\mathcal{Q}_t\mu - \mathcal{Q}\mu_*\|_{\mathrm{TV}} \le 2P(\tau_\infty > t).$$

Remember that $Q(\mu, \mu_*)$ was constructed as the marginal of the law of a Markov process with continuous time, living on an augmented phase space $X$. Since we are only interested in bounds on the random time $\tau_\infty$ and since we know that $x_s = y_s$ as long as the coupled system is in the state 2, it suffices to consider the Markov chain $(Z_n, \tau_n)$ constructed in (4.8). It is clear that $\tau_\infty$ is then dominated by the random time $\tilde\tau_\infty$ defined as

$$\tilde\tau_\infty = \inf\{\tau_n \mid S_m = 2 \ \forall m \ge n\},$$

where $S_n$ is the component of $Z_n$ indicating the type of the corresponding step. Our interest therefore only goes to the dynamic of $\tau_n$ and $S_n$. We define the sequence of times $t(n)$ by

$$t(0) = 1, \qquad t(n+1) = \inf\{m > t(n) \mid S_m = 1\}, \qquad (6.2)$$

and the sequence of durations $\Delta\tau_n$ by

$$\Delta\tau_n = \tau_{t(n+1)} - \tau_{t(n)},$$

with the convention $\Delta\tau_n = +\infty$ if $t(n)$ is infinite (i.e. if the set in (6.2) is empty). Notice that we set $t(0) = 1$ and not $0$ because we will treat step 0 of the coupled process separately. The duration $\Delta\tau_n$ therefore measures the time needed by the coupled system starting in step 1 to come back again to step 1. We define the sequence $\zeta_n$ by

$$\zeta_0 = 0, \qquad \zeta_{n+1} = \begin{cases} -\infty & \text{if } \Delta\tau_n = +\infty,\\ \zeta_n + \Delta\tau_n & \text{otherwise.}\end{cases}$$

By construction, one has

$$\tilde\tau_\infty = \tau_1 + \sup_{n \ge 0} \zeta_n, \qquad (6.3)$$

so we study the tail distribution of the $\Delta\tau_n$.
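
The bookkeeping behind (6.2) and (6.3) can be illustrated on a finite record of step types and step times. The sketch below computes the return indices $t(n)$, the durations $\Delta\tau_n$, the sequence $\zeta_n$ and the right-hand side of (6.3). The function names and the sample record are hypothetical, and the truncation rule (the record is assumed to remain in step 2 forever after its last entry, so the last duration is set to $+\infty$) is an assumption made only so that a finite list can stand in for the infinite chain.

```python
import math


def return_times(S):
    """Indices t(0) = 1 < t(1) < ... at which the step type equals 1, as in (6.2)."""
    times = [1]
    while True:
        later = [m for m in range(times[-1] + 1, len(S)) if S[m] == 1]
        if not later:
            return times
        times.append(later[0])


def durations(S, tau):
    """Durations between consecutive returns to step 1; the last one is +inf by assumption."""
    t = return_times(S)
    dts = [tau[t[i + 1]] - tau[t[i]] for i in range(len(t) - 1)]
    dts.append(math.inf)  # assumption: no further step 1 after the end of the record
    return dts


def zeta(dts):
    """zeta_0 = 0, zeta_{n+1} = -inf if the n-th duration is infinite, zeta_n + duration otherwise."""
    z = [0.0]
    for d in dts:
        z.append(-math.inf if d == math.inf else z[-1] + d)
    return z


def tau_tilde(S, tau):
    """Right-hand side of (6.3): tau_1 + sup_n zeta_n."""
    return tau[1] + max(zeta(durations(S, tau)))


# Hypothetical record: step 0, then rounds of steps 1, 2, ..., 3, ending in step 2.
S = [0, 1, 2, 2, 3, 1, 2, 3, 1, 2, 2, 2]
tau = [0.0, 2.0, 3.0, 4.0, 6.0, 20.0, 21.0, 22.0, 50.0, 51.0, 52.0, 54.0]

if __name__ == "__main__":
    print(return_times(S), durations(S, tau), tau_tilde(S, tau))
```

On this record the supremum in (6.3) is reached at the last return to step 1, so the computed value is the time attached to that return.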

For the moment, we leave the value $\alpha$ appearing throughout the paper free; we will tune it at the end of the proof. Notice also that, by Remarks 5.11 and 5.16, the cost increases by less than $\frac{1}{2N^2}$ every time the counter $N$ is increased by 1. Since the initial condition has no cost (by the choice (4.6) of its distribution), this implies that, with probability 1, the system is in an admissible state every time step 1 is performed.

Let us first consider the probability of $\Delta\tau_n$ being infinite. By Remark 5.11, the probability for step 1 to succeed is always greater than $\delta$. After step 1, the $N$th occurrence of step 2 has length $2^N$, and a probability greater than $\max\{\delta', 1 - K2^{-\alpha N}\}$ of succeeding. Therefore, one has

$$P(\Delta\tau_n > 2^N) \ge \delta \prod_{k=0}^{N} \max\{\delta', 1 - K2^{-\alpha k}\}.$$

This product always converges, so there exists a constant $p_* > 0$ such that

$$P(\Delta\tau_n = \infty) \ge p_*,$$

for every $n \ge 0$. Since our estimates are uniform over all admissible initial conditions and the coupling is chosen in such a way that the system is always in an admissible state at the beginning of step 1, we actually just proved that the conditional probability of $\{\Delta\tau_n = \infty\}$ on any event involving $S_m$ and $\Delta\tau_m$ for $m < n$ is bounded from below by $p_*$.
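
The convergence of the infinite product to a strictly positive limit can be observed numerically. The sketch below evaluates partial products of $\max\{\delta', 1 - K2^{-\alpha k}\}$; the function name and the parameter values $\delta' = 0.5$, $K = 1$, $\alpha = 0.25$ are arbitrary illustrative choices, not values taken from the paper.

```python
import math


def partial_product(n_terms, delta_prime, K, alpha):
    """Product over k = 0, ..., n_terms - 1 of max{delta', 1 - K 2^(-alpha k)}."""
    return math.prod(
        max(delta_prime, 1 - K * 2.0 ** (-alpha * k)) for k in range(n_terms)
    )


if __name__ == "__main__":
    for n in (10, 50, 200, 400):
        # The partial products decrease and stabilise at a positive value.
        print(n, partial_product(n, 0.5, 1.0, 0.25))
```

Only finitely many factors are equal to $\delta'$, and the remaining ones differ from 1 by a summable amount, which is why the limit is positive.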

For $\Delta\tau_n$ to be finite, there has to be a failure of step 2 at some point (see (4.7)). Recall that if step 2 succeeds exactly $N$ times, the corresponding value for $\Delta\tau_n$ will be equal to $2^{N+1} + t_* 2^{\beta N}(1+n)^{4/(1-2\alpha)}$ for $N > 0$ and to $t_*(1+n)^{4/(1-2\alpha)}$ for $N = 0$. This follows from (5.12b) and (5.42), noticing that $N$ in those formulae counts the number of times step 1 occurred and is therefore equal to $n$. We also know that the probability of the $N$th occurrence of step 2 to fail is bounded from above by $K2^{-\alpha N}$. Therefore, a very crude estimate yields a constant $C$ such that

$$P\bigl((1+n)^{-4/(1-2\alpha)}\Delta\tau_n > C2^{\beta N} \text{ and } \Delta\tau_n \ne \infty\bigr) \le K\sum_{k > N} 2^{-\alpha k}.$$

This immediately yields for some other constant $C$

$$P\bigl((1+n)^{-4/(1-2\alpha)}\Delta\tau_n > T \text{ and } \Delta\tau_n \ne \infty\bigr) \le CT^{-\alpha/\beta}. \qquad (6.4)$$
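
The passage from a geometric tail in $N$ to an algebraic tail in $T$ rests on two elementary facts: $\sum_{k>N} 2^{-\alpha k}$ is a constant multiple of $2^{-\alpha N}$, and $2^{-\alpha N} = (T/C)^{-\alpha/\beta}$ when $T = C2^{\beta N}$. The sketch below checks both numerically; the function names and sample values are illustrative assumptions.

```python
def tail_sum(N, alpha, terms=10_000):
    """Truncated numerical value of the sum over k > N of 2^(-alpha k)."""
    return sum(2.0 ** (-alpha * k) for k in range(N + 1, N + 1 + terms))


def tail_sum_closed(N, alpha):
    """Geometric series: 2^(-alpha (N + 1)) / (1 - 2^(-alpha))."""
    return 2.0 ** (-alpha * (N + 1)) / (1 - 2.0 ** (-alpha))


def geometric_as_power_of_T(N, alpha, beta, C=1.0):
    """With T = C 2^(beta N), returns (T / C)^(-alpha / beta), which equals 2^(-alpha N)."""
    T = C * 2.0 ** (beta * N)
    return (T / C) ** (-alpha / beta)


if __name__ == "__main__":
    print(tail_sum(10, 0.2), tail_sum_closed(10, 0.2))
    print(geometric_as_power_of_T(10, 0.2, 2.0, C=3.0), 2.0 ** (-0.2 * 10))
```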

As a consequence, the process $\zeta_n$ is stochastically dominated by the Markov chain $\tilde\zeta_n$ defined by

$$\tilde\zeta_0 = 0, \qquad \tilde\zeta_{n+1} = \begin{cases} -\infty & \text{with probability } p_*,\\ \tilde\zeta_n + (n+1)^{4/(1-2\alpha)} p_n & \text{with probability } 1 - p_*,\end{cases}$$

where the $p_n$ are positive i.i.d. random variables with tail distribution $CT^{-\alpha/\beta}$, i.e.

$$P(p_n > T) = \begin{cases} CT^{-\alpha/\beta} & \text{if } CT^{-\alpha/\beta} < 1,\\ 1 & \text{otherwise.}\end{cases}$$

With these notations and using the representation (6.3), $\tilde\tau_\infty$ is bounded by

$$P(\tilde\tau_\infty > t) \le P(\tau_1 > t/2) + P\Bigl(\sum_{n=0}^{n_*} (n+1)^{4/(1-2\alpha)} p_n > t/2\Bigr), \qquad (6.5)$$

where $n_*$ is a random variable independent of the $p_n$ and such that

$$P(n_* = k) = p_*(1 - p_*)^k. \qquad (6.6)$$

In order to bound the second term in (6.5), it thus suffices to estimate terms of the form $\sum_{n=0}^{k} (n+1)^{4/(1-2\alpha)} p_n$ for fixed values of $k$. Using the Cauchy-Schwarz inequality, one obtains the existence of positive constants $C$ and $N$ such that

$$P\Bigl(\sum_{n=0}^{k} (n+1)^{4/(1-2\alpha)} p_n > t/2\Bigr) \le C(k+1)^N t^{-\alpha/\beta}.$$
Combining this with (6.6) and (6.5) yields, for some other constant $C$,

$$P(\tilde\tau_\infty > t) \le P(\tau_1 > t/2) + Ct^{-\alpha/\beta}.$$
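
The random sum appearing in (6.5) can also be explored by simulation. The following Monte Carlo sketch samples a geometric number of terms $n_*$ as in (6.6), draws the $p_n$ by inverting the tail $\min\{1, CT^{-\alpha/\beta}\}$, and estimates the tail of the weighted sum. It is only an illustration of the dominating chain under arbitrary, hypothetical parameter values ($p_* = 0.3$, $\alpha = 0.2$, $\beta = 2$, $C = 1$); it does not reproduce any computation from the paper.

```python
import random


def sample_n_star(p_star, rng):
    """Number of finite steps before the chain jumps to -infinity: P(n = k) = p(1 - p)^k."""
    n = 0
    while rng.random() >= p_star:
        n += 1
    return n


def sample_p(C, a, rng):
    """Inverse-transform sample with P(p > T) = min(1, C T^(-a))."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return (C / u) ** (1.0 / a)


def sample_sum(p_star, alpha, beta, C, rng):
    """One sample of the sum over n = 0, ..., n_star of (n + 1)^(4 / (1 - 2 alpha)) p_n."""
    weight_exponent = 4 / (1 - 2 * alpha)
    n_star = sample_n_star(p_star, rng)
    return sum(
        (n + 1) ** weight_exponent * sample_p(C, alpha / beta, rng)
        for n in range(n_star + 1)
    )


def empirical_tail(samples, t):
    """Fraction of samples strictly larger than t."""
    return sum(s > t for s in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)
    sums = [sample_sum(0.3, 0.2, 2.0, 1.0, rng) for _ in range(5000)]
    for t in (1e2, 1e4, 1e8, 1e16):
        print(t, empirical_tail(sums, t))
```

Since the exponent $\alpha/\beta$ is small, the tail decays very slowly, so thresholds spread over many orders of magnitude are needed to see any decay.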
By the definition of step 0 (5.4), we get for $\tau_1$:

$$P(\tau_1 > t/2) \le \mu(\{\|x_0\| > e^{\delta t}/\sqrt2\}) + \mu_*(\{\|y_0\| > e^{\delta t}/\sqrt2\}).$$

Since, by Proposition 3.12, the invariant measure $\mu_*$ has bounded moments, the second term decays exponentially fast. Since $\alpha < \min\{\frac12, H\}$ and $\beta > (1-2\alpha)^{-1}$ are arbitrary, one can realise $\gamma = \alpha/\beta$ for $\gamma$ as in the statement.

This concludes the proof of Theorem 6.1. □
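
To make the bound on the exponent concrete: the function $\alpha \mapsto \alpha(1-2\alpha)$ increases on $(0, \frac14)$ and attains its maximum $\frac18$ at $\alpha = \frac14$, so the supremum over $\alpha < H$ equals $\frac18$ when $H > \frac14$ and $H(1-2H)$ otherwise. The short sketch below evaluates this and cross-checks it against a brute-force grid; the function names are illustrative assumptions.

```python
def rate_bound(H):
    """Supremum of alpha (1 - 2 alpha) over 0 < alpha < H, for H in (0, 1)."""
    if not 0 < H < 1:
        raise ValueError("H must lie in (0, 1)")
    if H > 0.25:
        return 0.125           # maximiser alpha = 1/4 is admissible
    return H * (1 - 2 * H)      # supremum approached as alpha increases to H


def rate_bound_grid(H, points=100_000):
    """Brute-force check on a grid of alpha values in (0, H)."""
    return max(
        a * (1 - 2 * a) for a in (H * (i + 1) / (points + 1) for i in range(points))
    )


if __name__ == "__main__":
    for H in (0.1, 0.25, 0.5, 0.75):
        print(H, rate_bound(H), rate_bound_grid(H))
```

Any $\gamma$ strictly below the returned value is then covered by the statement of Theorem 6.1.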

We conclude this paper by discussing several possible extensions of our result. The 
first two extensions are straightforward and can be obtained by simply rereading the 
paper carefully and (in the second case) combining its results with the ones obtained in 
the references. The two other extensions are less obvious and merit further investiga- 
tion. 

6.1 Noise with multiple scalings 

One can consider the case where the equation is driven by several independent fBm's with different values of the Hurst parameter:

$$dx_t = f(x_t)\,dt + \sum_{i=1}^{m} \sigma_i\,dB_{H_i}(t).$$

It can be seen that in this case, the invertibility of $\sigma$ should be replaced by the condition that the linear operator

$$\sigma = \sigma_1 \oplus \sigma_2 \oplus \ldots \oplus \sigma_m : \mathbf{R}^{mn} \to \mathbf{R}^n,$$

has rank $n$. The condition on the convergence exponent $\gamma$ then becomes

$$\gamma < \min\{\gamma_1, \ldots, \gamma_m\},$$

where $\gamma_i = \max_{\alpha < H_i} \alpha(1 - 2\alpha)$.
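
A small sketch of how these two conditions could be checked for given data. It assumes that the operator $\sigma$ acts on $(v_1, \ldots, v_m)$ as $\sum_i \sigma_i v_i$, so that its matrix is the horizontal concatenation of the $\sigma_i$; this reading, the function names and the sample matrices are assumptions made for illustration. The rank is computed by exact Gaussian elimination.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def concatenate(sigmas):
    """n x (m n) matrix [sigma_1 | ... | sigma_m] built from m matrices with n rows each."""
    return [sum((s[i] for s in sigmas), []) for i in range(len(sigmas[0]))]


def exponent_bound(hursts):
    """min over i of sup_{alpha < H_i} alpha (1 - 2 alpha)."""
    return min(0.125 if H > 0.25 else H * (1 - 2 * H) for H in hursts)


# Hypothetical example with n = 2, m = 2: each sigma_i is singular on its own.
sigma_1 = [[1, 0], [0, 0]]
sigma_2 = [[0, 0], [0, 1]]

if __name__ == "__main__":
    print(rank(sigma_1), rank(sigma_2), rank(concatenate([sigma_1, sigma_2])))
    print(exponent_bound([0.7, 0.1]))
```

In this example neither noise term is non-degenerate by itself, yet the combined operator has full rank, and the exponent is limited by the smaller Hurst parameter.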

6.2 Infinite-dimensional case 

In the case where the phase space for (SDE) is infinite-dimensional, the question of global existence of solutions is technically more involved and was tackled in [MN02]. Another technical difficulty arises from the fact that one might want to take for $\sigma$ an operator which is not boundedly invertible, so A3 would fail on a formal level. One expects to be able to overcome this difficulty at least in the case where the equation is semilinear and parabolic, i.e. of the type

$$dx = Ax\,dt + F(x)\,dt + Q\,dB_H(t),$$

with the domain of $F$ "larger" (in a sense to be quantified) than the domain of $A$ and $B_H$ a cylindrical fBm on some Hilbert space $\mathcal{H}$ on which the solution is defined, provided the eigenvalues of $A$ and of $Q$ satisfy some compatibility condition as in [DPZ92, Cer99, EH01].

On the other hand, it is possible in many cases to split the phase space into a finite 
number of "unstable modes" and an infinite number of "stable modes" that are slaved 
to the unstable ones. In this situation, it is sufficient to construct step 1 in such a way 
that the unstable modes meet, since the stable ones will then automatically converge 
towards each other. A slight drawback of this method is that the convergence towards 
the stationary state no longer takes place in the total variation distance. We refer to 
[Mat02, KS01, Hai02] for implementations of this idea in the Markovian case.

6.3 Multiplicative noise 

In this case, the problem of existence of global solutions can already be hard. In the 
case H > 1/2, the fBm is sufficiently regular, so one obtains pathwise existence of 
solutions by rewriting (SDE) in integral form and interpreting the stochastic integral 
pathwise as a Riemann-Stieltjes integral. In the case $H \in (\frac14, \frac12)$, it has been shown recently [Lyo94, Lyo98, CQ02] that pathwise solutions can also be obtained by realising
the fBm as a geometric rough path. More refined probabilistic estimates are required in 
the analysis of step 1 of our coupling construction. The equivalent of equation (5.18) 
then indeed contains a multiplicative noise term, so the deterministic estimate (5.20) 
fails. 

6.4 Arbitrary Gaussian noise 

Formally, white noise is a centred Gaussian process $\xi$ with correlation function

$$\mathbf{E}\xi(s)\xi(t) = C_{1/2}(t - s) = \delta(t - s).$$

The derivative of the fractional Brownian motion with Hurst parameter $H$ is formally also a centred Gaussian process, but its correlation function is proportional to

$$C_H(t - s) = |t - s|^{2H-2},$$

which should actually be interpreted as the second derivative of $|t - s|^{2H}$ in the sense of distributions.
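
Away from the diagonal, this can be checked directly from (1.1). By polarisation, the covariance of two increments of length $h$ at lag $r = t - s$ is $\frac12\bigl(|r+h|^{2H} + |r-h|^{2H} - 2|r|^{2H}\bigr)$, a second difference of $\frac12|r|^{2H}$, and dividing by $h^2$ gives $H(2H-1)|r|^{2H-2}$ in the limit $h \to 0$. The sketch below compares the two numerically; the function names and sample values are illustrative assumptions.

```python
def increment_cov(lag, h, H):
    """Covariance of B_H(t+h) - B_H(t) and B_H(s+h) - B_H(s) with lag = t - s, from (1.1)."""
    return 0.5 * (
        abs(lag + h) ** (2 * H) + abs(lag - h) ** (2 * H) - 2 * abs(lag) ** (2 * H)
    )


def limiting_kernel(lag, H):
    """Pointwise limit of increment_cov / h^2 for lag != 0: H (2H - 1) |lag|^(2H - 2)."""
    return H * (2 * H - 1) * abs(lag) ** (2 * H - 2)


if __name__ == "__main__":
    h = 1e-3
    for H in (0.3, 0.5, 0.7):
        print(H, increment_cov(1.0, h, H) / h**2, limiting_kernel(1.0, H))
```

The prefactor $H(2H-1)$ is negative for $H < \frac12$, zero for $H = \frac12$ and positive for $H > \frac12$, matching the fact that disjoint increments are uncorrelated only in the Brownian case.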

A natural question is whether the results of the present paper also apply to differen- 
tial equations driven by Gaussian noise with an arbitrary correlation function C{t — s). 
There is no conceptual obstruction to the use of the method of proof presented in this 
paper in that situation, but new estimates are required. The proof uses the fact that the driving process is a fractional Brownian motion only in order to perform the computations of Section 5 explicitly.